Abstract

We consider an inventory system in which inventory level fluctuates as a Brownian 
motion in the absence of control. The inventory continuously accumulates cost at a 
rate that is a general convex function of the inventory level, which can be negative 
when there is a backlog. At any time, the inventory level can be adjusted by a positive 
or negative amount, which incurs a fixed cost and a proportional cost. The challenge 
is to find an adjustment policy that balances the holding cost and adjustment cost to 
minimize the long-run average cost. When both upward and downward fixed costs are 
positive, our model is an impulse control problem. When both fixed costs are zero, 
our model is a singular or instantaneous control problem. For the impulse control 
problem, we prove that a four-parameter control band policy is optimal among all 
feasible policies. For the singular control problem, we prove that a two-parameter 
control band policy is optimal. 

We use a lower-bound approach, widely known as "the verification theorem", to 
prove the optimality of a control band policy for both the impulse and singular control 
problems. Our major contribution is to prove the existence of a "smooth" solution to 
the free boundary problem under some mild assumptions on the holding cost function. 
The existence proof leads naturally to a numerical algorithm to compute the optimal 
control band parameters. We demonstrate that the lower-bound approach also works
for a Brownian inventory model in which no inventory backlog is allowed. In a companion
paper, we will show how the lower-bound approach can be adapted to study a Brownian 
inventory model under a discounted cost criterion. 
1 Introduction



This paper is concerned with optimal control of Brownian inventory models under the
long-run average cost criterion. It serves two purposes. First, it provides a tutorial on the
powerful lower-bound approach, known as "the verification theorem", to proving the
optimality of a control band policy among all feasible policies. The tutorial is rigorous and,
except for the standard Ito formula, self-contained. Second, it contributes to the literature by
proving the existence of a "smooth" solution to the free boundary problem with a general
convex holding cost function. The existence proof leads naturally to algorithms to compute
the optimal control band parameters. The companion paper [14] studies the optimal control
of Brownian inventory models under a discounted cost criterion.

The Model Description 

In this paper and the companion paper [14], the inventory netput process is assumed to
follow a Brownian motion with drift $\mu$ and variance $\sigma^2$. The netput process captures the
difference between regular supplies, possibly through a long-term contract, and customer
demands. Controls are exercised on the netput process to keep the inventory at desired
positions. The controlled process, denoted by $Z = \{Z(t), t \ge 0\}$, is called the inventory
process in this paper. For each time $t \ge 0$, $Z(t)$ is interpreted as the inventory level at time
$t$, although $Z(t)$ can be negative, in which case $|Z(t)|$ represents the inventory backlog at
time $t$. We assume that the holding cost function $h : \mathbb{R} \to \mathbb{R}_+$ is a general convex function.
Thus, $\int_0^t h(Z(s))\,ds$ is the cumulative inventory cost by time $t$.

Inventory position is assumed to be adjustable, either upward or downward. All adjustments
are realized immediately without any leadtime delay. Each upward adjustment
with amount $\xi > 0$ incurs a cost $K + k\xi$, where $K \ge 0$ and $k > 0$ are the fixed cost
and the variable cost, respectively, for each upward adjustment. Similarly, each downward
adjustment with amount $\xi > 0$ incurs a cost of $L + \ell\xi$ with fixed cost $L \ge 0$ and variable cost
$\ell > 0$. The objective is to find some control policy that balances the inventory cost and
the adjustment cost so that the long-run average total cost is minimized.

In describing our Brownian control problems, we have used the inventory terminology 
in supply chain management. One could describe such control problems in cash flow 
management. In this case, $Z(t)$ represents the cash amount at time $t \ge 0$. There are a
large number of papers in the economics literature that have studied the Brownian control 
problems (e.g. Dixit [15]). Readers are referred to Stokey [29] and the references there 
for a variety of economic applications of Brownian control problems. While the discounted 
cost criterion is appropriate for cash flow management, the long-run average cost criterion 
is natural for many production/inventory problems. 

When both fixed costs $K$ and $L$ are positive, it is clear that non-trivial feasible control
policies should limit the number of adjustments to be finite within any finite time interval.
Under such a control policy, inventory is adjusted at a sequence of discrete times and
the resulting control problem is termed the impulse control of a Brownian motion.
When both fixed costs vanish, $K = 0$ and $L = 0$, it can be advantageous for the system to make
an "infinitesimal" amount of adjustment at any moment. Indeed, as will be shown in
Section 6, an optimal policy will make an uncountable number of adjustments within a finite
time interval. The resulting control problem is termed the singular or instantaneous
control of a Brownian motion. In this paper, we treat impulse and singular control of a
Brownian motion in a single framework. Conceptually, one may view the singular control
problem as a limit of a sequence of impulse control problems as the fixed costs $K \to 0$ and
$L \to 0$. Such a connection between impulse and singular control problems allows us to solve
a mixed impulse-singular control problem (for example, $K > 0$ and $L = 0$) without much
additional effort.

Non-Linear Holding Cost 

When the holding cost function $h$ is given by

$$h(x) = \begin{cases} -p\,x & \text{if } x < 0, \\ c\,x & \text{if } x \ge 0, \end{cases} \tag{1.1}$$

for some constants $p > 0$ and $c > 0$, we call $h$ in (1.1) a linear holding cost function,
even though $h(x)$ in (1.1) is piecewise linear in the inventory level $x$. With this holding cost
function, inventory backlog cost is linear and inventory excess cost is also linear, but $h(x)$ is
not differentiable at $x = 0$. Although many papers have focused on the linear holding cost function
(e.g. [19]), there are ample applications that motivate a non-linear holding cost function.
For example, [10] and [25] studied optimal index tracking of a benchmark index when
there are transaction costs. An impulse control problem with quadratic holding cost arises
naturally in their studies. Quadratic holding cost and general convex holding cost also
arise in economic papers; see, for example, [9, 24, 32].

Optimal Policy Structure 

For an impulse Brownian control problem under the long-run average cost criterion, we
prove in Section 5 that a control band policy $\varphi = \{d, D, U, u\}$ is optimal among all feasible
policies. Under the control band policy $\varphi$, an adjustment is placed so as to bring the
inventory up to level $D$ when the inventory level drops to level $d$ and to bring the inventory
down to level $U$ when the inventory level rises to level $u$. For a singular Brownian control
problem, we show in Section 6 that the optimal policy is a degenerate control band policy
with two free parameters $D = d$ and $U = u$. When the inventory level is restricted to be
always nonnegative, we show in Section 7 that the optimal policy for an impulse Brownian
control problem is again a control band policy. Depending on the holding cost function $h$,
this control band policy sometimes, but not always, has only three free parameters $D$, $U$
and $u$ that need to be characterized, with the lowest boundary $d = 0$. Although we will
not explicitly study the mixed impulse-singular Brownian control problems, it is clear from
our proofs that a degenerate control band policy with three parameters is optimal.
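
The mechanics of a control band policy can be illustrated with a small simulation. The following is a minimal sketch, not a method from this paper: it assumes an Euler discretization of the netput process, so adjustments happen only at grid times and the level may overshoot $d$ or $u$ slightly. The function name `simulate_control_band` and the parameters `horizon`, `dt` and `seed` are hypothetical choices made for illustration.

```python
import math
import random


def simulate_control_band(d, D, U, u, mu, sigma, h, K, k, L, ell,
                          x0=0.0, horizon=200.0, dt=1e-3, seed=0):
    """Estimate the average cost per unit time of the band policy {d, D, U, u}.

    The netput process is approximated by an Euler scheme with step dt.
    When the level is at or below d it is raised to D; when it is at or
    above u it is lowered to U. Fixed and proportional costs are charged
    for every adjustment, and holding cost h(z) accrues at rate h(z).
    """
    if not (d <= D <= U <= u and d < u):
        raise ValueError("band parameters must satisfy d <= D <= U <= u, d < u")
    rng = random.Random(seed)
    n_steps = int(round(horizon / dt))
    z = x0
    total = 0.0
    for _ in range(n_steps):
        # One Euler step of the uncontrolled Brownian motion.
        z += mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        total += h(z) * dt                      # holding cost on this step
        if z <= d:                              # upward adjustment to D
            total += K + k * (D - z)
            z = D
        elif z >= u:                            # downward adjustment to U
            total += L + ell * (z - U)
            z = U
    return total / (n_steps * dt)
```

Because the estimate depends on the step size and the horizon, it should be read as a rough numerical illustration of the policy rather than as the long-run average cost itself.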

The Lower-Bound Approach and the Free Boundary Problem 

This paper promotes a three-step, lower-bound approach to solving Brownian control problems
under the long-run average cost criterion. In the first step, we prove Theorem 4.1
showing that if there exist a constant $\gamma$ and a "smooth" test function $f$ that is defined on
the entire real line (or the positive half line when inventory is not allowed to be backlogged)
such that $f$ and $\gamma$ jointly satisfy some differential inequalities, then the long-run average
cost under any feasible policy is at least $\gamma$. This theorem is formulated and proved for
all impulse, singular and mixed impulse-singular control problems. In the second step, we
show in Theorems 5.1 and 6.1 that for a given control band policy, its long-run average
cost can be computed as a solution to a Poisson equation. This equation is a second-order
ordinary differential equation (ODE) with given boundary conditions at the boundary
points of the band. As a part of the solution to the Poisson equation, we also obtain the
relative value function. The relative value function can naturally be extended to the entire
real line, but the extended function may not be continuously differentiable at the boundary
points of the control band. In the third step, we search for a control band policy such that
the corresponding relative value function can indeed be extended smoothly as a function
$f$ on the entire real line. Furthermore, this smooth function $f$, together with the long-run
average cost under the control band policy, satisfies the differential inequalities in step 1
within the entire real line. Clearly, if the control band policy in step 3 can be found, it
must be an optimal policy by Theorem 4.1. The lower-bound theorem, Theorem 4.1, is
known as the "verification theorem" in the literature.

Step 3 is the most critical step in the three-step approach. In order to make the relative 
value function smoothly extendible to the entire real line, the parameters of the control 
band must be carefully selected. These parameters serve as the boundary points of the 
ODE and they themselves need to be determined. The smoothness requirements impose
conditions on the ODE solution at these yet-to-be-found boundary points. Thus, the ODE
in step 3 is known as the free boundary ODE problem. Solving the free boundary problem 
to find the optimal parameters is also known as the "smooth pasting" method [9]. Solving 
a free boundary problem is often technically difficult. The number of free parameters of 
an optimal control band policy dictates the level of difficulty in solving the free boundary 
problem. Many papers in the literature left it unsolved (e.g. [15, 27]), assuming there is a 
solution to the free boundary problem with a certain smoothness property. 

Contributions 

The Brownian inventory control problem is now a classical problem, starting from Bather [3]
thirty-five years ago. We will survey the research area in the next several paragraphs. In
addition to providing a self-contained tutorial on the lower-bound approach to studying
optimal control problems, our paper contributes significantly in the following areas. (a)
Under a general convex holding cost function with some minor assumptions, we rigorously
prove the existence of a control band policy that is optimal for both the impulse and singular
control problems under the long-run average cost criterion. (b) Under the general convex
holding cost function, we prove the existence of a solution to the four-parameter
free boundary problem. Our existence proof leads naturally to algorithms for computing
optimal control band parameters. These algorithms reduce to root finding for continuous,
monotone functions. Thus, the convergence of these algorithms is guaranteed. We are
not aware of any paper that proved the existence of a solution to the four-parameter free 
boundary problem under the long-run average cost criterion. In the discounted setting, 
[13] solved the four-parameter free boundary problem when h is linear, and [1] solved 
the problem when h is quadratic. Recently, Feng and Muthuraman [17] developed an 
algorithm to numerically solve the four-parameter free boundary problem for the discounted 
Brownian control problem. They illustrate the convergence of their algorithm through some 
numerical examples. However, the convergence of their algorithm was not established. (c)
Under the long-run average cost criterion, our lower-bound approach provides a unified
treatment for both the impulse and singular control problems, with and without inventory
backlog. In particular, we do not need to employ the vanishing discount approach [16, 21, 28]
to study the long-run average cost problems. In her book, Stokey [29] summarizes both 
the impulse and instantaneous controls of Brownian motion with a general convex holding 
cost function. She focused on the discounted cost problems, and employed the vanishing 
discount approach to deal with the long-run average cost problems. It is appealing that 
our current paper studies the long-run average cost problem directly, and characterizes the 
optimal parameters directly without going through the vanishing discount procedure. 
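
Root finding for a continuous, monotone function is guaranteed to converge under bisection, which is the kind of guarantee contribution (b) refers to. The sketch below is a generic bisection routine, offered only as an illustration; the specific monotone functions arising from the free boundary problem are constructed in Section 5.3 and are not reproduced here, so `g`, `lo` and `hi` are placeholders.

```python
def bisect_root(g, lo, hi, tol=1e-10, max_iter=200):
    """Find a root of a continuous function g on [lo, hi] by bisection.

    Requires g(lo) and g(hi) to have opposite signs (or one of them to be
    zero). If g is also monotone, the root is unique.
    """
    g_lo, g_hi = g(lo), g(hi)
    if g_lo == 0.0:
        return lo
    if g_hi == 0.0:
        return hi
    if g_lo * g_hi > 0.0:
        raise ValueError("g(lo) and g(hi) must have opposite signs")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_mid == 0.0 or hi - lo < tol:
            return mid
        if g_lo * g_mid < 0.0:
            hi = mid                 # the sign change is in [lo, mid]
        else:
            lo, g_lo = mid, g_mid    # the sign change is in [mid, hi]
    return 0.5 * (lo + hi)
```

Each iteration halves the bracketing interval, so the error after $n$ iterations is at most $(hi - lo)/2^n$ regardless of the shape of `g`.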

Literature Review 

The lower-bound approach was used in [23, 31] under a long-run average cost criterion
and in [19, 20] under a discounted cost criterion. The approach is essentially the same
as the quasi-variational inequality (QVI) approach that was pioneered by Bensoussan and
Lions [6]. The QVI approach was systematically developed in a French book that was later
translated into English (see [7]). An appealing feature of the QVI approach is that it is
sufficient to solve a QVI problem in order to obtain an optimal policy for an inventory
control problem, and this sufficiency is established in [7]. The QVI problem is a purely
analytical problem that is closely related to the free boundary problem. Many authors directly
start with the QVI problems, relying on the "verification theorem" developed in [7]; see, for
example, [4, 5, 8, 30]. The potential drawback of this approach is that when the formulation
of a Brownian control problem is slightly different from the setting in [7], one may have
to develop a new verification theorem, presumably mimicking the development in the
book. In contrast, our lower-bound approach allows us to provide a self-contained, rigorous
proof simultaneously for impulse, singular and mixed control problems. It also allows one
to directly see how the smoothness requirement of a solution to the free boundary problem is
used. We believe the lower-bound approach is easier to generalize to high-dimensional
Brownian control problems.

The impulse control problem with both upward and downward adjustments was studied 
as early as 1976 and 1978 in two papers by Constantinides [12] and Constantinides et al. [13]. 
The first paper studies the long-run average cost objective and the second paper studies the 
discounted cost objective. Both papers assume the holding cost function is linear as given 
in (1.1). Under this holding cost function, the optimal control band parameters can be 
explicitly characterized. Baccarin [1] studies the discounted impulse Brownian control problem
with a quadratic inventory cost function. When the inventory is restricted to be nonnegative,
but still under the linear holding cost assumption (1.1), Harrison et al. [19] study
the discounted cost, impulse Brownian control problem whereas Ormeci et al. [23] study
the long-run average cost problem. Under the linear holding cost function assumption, the
optimal policy is a degenerate control band policy {0, D, U, u}, where three optimal param- 
eters D, U, u can be determined explicitly. However, under our general convex holding cost 
assumption, the optimal policy for the impulse control problem without inventory back- 
log is again a control band policy {d, D,U,u}, with d sometimes being strictly positive. 
Harrison and Taksar [20] and Taksar [31] study the singular Brownian control problem 
under a general convex inventory cost function assumption. The former paper studies the 
discounted cost problem and the latter studies the long-run average cost problem. Tak- 
sar [31] characterizes the optimal control band parameters through the optimal stopping 
time to a stochastic game without solving the two-parameter free boundary problem. As 
in [31], Stokey [29] characterizes her optimal parameters through a stopping time problem 
without solving the four-parameter free boundary problem. These stopping time character- 
izations do not easily lead to any numerical algorithm to compute two optimal parameters. 
Richard [27] studies an impulse control of a general one-dimensional diffusion process. He
assumes without proof the existence of a solution to a quasi-variational inequality problem
with a certain regularity property in order to characterize an optimal policy. Kumar and
Muthuraman [22] develop a numerical algorithm to solve high-dimensional singular control 
problems. Vickson [33] studies a cycling problem with Brownian motion demand. 

In his pioneering paper, Bather [3] studies the impulse Brownian motion control problem
without downward adjustment, under the long-run average cost criterion. For most
inventory problems, the absence of downward adjustment is a natural setting. Under a general
holding cost function, he suggests that an (s, S) policy is optimal and derives equations 
that characterize the optimal parameters s and S. Many authors have generalized this 
paper to various settings, to discounted cost problems with linear holding cost in [30], to 
discounted cost problems with and without inventory backlog in [11], to discounted cost 
problems under the general convex holding cost function assumption in [4], to discounted 
cost problems with positive constant leadtime in [2], to compound Poisson and diffusion de- 
mand processes in [5, 8]. Because there is no downward adjustment in these problems, the 
optimal policy has two parameters and the resulting two-parameter free boundary problem
can be solved much more easily than the four-parameter one.

Paper Organization 

The rest of this paper is organized as follows. In Section 2, we define our Brownian control 
problem in a unified setting that includes impulse, singular and mixed impulse-singular 
controls. In Section 3 we present a version of the Ito formula that does not require the test
function $f$ to be a $C^2$ function. A lower bound for all feasible policies is established in Section
4. Section 5 is devoted to impulse control problems that allow inventory backlog under the
long-run average cost criterion. Section 5.1 shows that under a control band policy, a
Poisson equation can produce a solution that gives both the long-run average cost and the
corresponding relative value function. Under the assumption that a free boundary problem
has a unique solution that has desired regularity properties, Section 5.2 proves that there
is a control band policy whose long-run average cost achieves the lower bound. Thus, the
control band policy is optimal among all feasible policies. Section 5.3 is a lengthy one
that is devoted to the existence proof of the solution to the free boundary problem. In that
section, the parameters for the optimal control band policy are characterized. Section 5.3
constitutes the main technical contribution of this paper. Section 6 solves the singular
control problem. This section is short, essentially becoming a special case of Section 5 when
both $K = 0$ and $L = 0$. Section 7 deals with impulse control problems when inventory
is not allowed to be backlogged. Finally, Section 8 summarizes this paper and discusses a few
extensions.

2 Brownian Control Models 

Let $X = \{X(t), t \ge 0\}$ be a Brownian motion with drift $\mu$ and variance $\sigma^2$, starting from
$x$. Then, $X$ has the following representation

$$X(t) = x + \mu t + \sigma W(t), \quad t \ge 0,$$

where $W = \{W(t), t \ge 0\}$ is a standard Brownian motion that has drift 0, variance 1,
starting from 0. We assume $W$ is defined on some filtered probability space $(\Omega, \{\mathcal{F}_t\}, \mathcal{F}, \mathbb{P})$
and $W$ is an $\{\mathcal{F}_t\}$-martingale. Thus, $W$ is also known as an $\{\mathcal{F}_t\}$-standard Brownian
motion. We use $X$ to model the netput process of the firm. For each $t \ge 0$, $X(t)$ represents
the inventory level at time $t$ if no control has been exercised by time $t$. The netput process
will be controlled and the actual inventory level at time $t$, after controls have been exercised,
is denoted by $Z(t)$. The controlled process is denoted by $Z = \{Z(t), t \ge 0\}$. With a slight
abuse of terminology, we call $Z(t)$ the inventory level at time $t$, although when $Z(t) < 0$,
$|Z(t)|$ is the backorder level at time $t$.

Controls are dictated by a policy. A policy $\varphi$ is a pair of stochastic processes $(Y_1, Y_2)$
that satisfies the following three properties: (a) for each sample path $\omega \in \Omega$, $Y_i(\omega, \cdot) \in \mathbb{D}$,
where $\mathbb{D}$ is the set of functions on $\mathbb{R}_+ = [0, \infty)$ that are right continuous on $[0, \infty)$ and have
left limits in $(0, \infty)$, (b) for each $\omega$, $Y_i(\omega, \cdot)$ is a nondecreasing function, (c) $Y_i$ is adapted
to the filtration $\{\mathcal{F}_t\}$, namely, $Y_i(t)$ is $\mathcal{F}_t$-measurable for each $t \ge 0$. We call $Y_1(t)$ and
$Y_2(t)$ the cumulative upward and downward adjustment, respectively, of the inventory in
$[0, t]$. Under a given policy $(Y_1, Y_2)$, the inventory level at time $t$ is given by

$$Z(t) = X(t) + Y_1(t) - Y_2(t) = x + \sigma W(t) + \mu t + Y_1(t) - Y_2(t), \quad t \ge 0. \tag{2.1}$$

Therefore, $Z$ is a semimartingale, namely, a martingale $\sigma W$ plus a process that is of
bounded variation.

A point $t \ge 0$ is said to be an increasing point of $Y_1$ if $Y_1(s) - Y_1(t-) > 0$ for each
$s > t$, where $Y_1(t-)$ is the left limit of $Y_1$ at $t$ with the convention that $Y_1(0-) = 0$. When $t$
is an increasing point of $Y_1$, we call it an upward adjustment time. Similarly, we define an
increasing point of $Y_2$ and call it a downward adjustment time. Let $N_i(t)$ be the cardinality
of the set

$$\{s \in [0, t] : Y_i \text{ increases at } s\}, \quad i = 1, 2.$$

In general, we allow an upward or downward adjustment at time $t = 0$. By convention, we
set $Z(0-) = x$ and call $Z(0-)$ the initial inventory level. By (2.1),

$$Z(0) = x + Y_1(0) - Y_2(0),$$

which can be different from the initial inventory level $Z(0-)$.

There are two types of costs associated with a control. They are fixed costs and proportional
costs. We assume that each upward adjustment incurs a fixed cost of $K \ge 0$ and each
downward adjustment incurs a fixed cost of $L \ge 0$. In addition, each unit of upward adjustment
incurs a proportional cost of $k > 0$ and each unit of downward adjustment incurs a
proportional cost of $\ell > 0$. Thus, by time $t$, the system incurs the cumulative proportional
cost $kY_1(t)$ for upward adjustment and the cumulative proportional cost $\ell Y_2(t)$ for downward
adjustment. When $K > 0$, we are only interested in policies such that $N_1(t) < \infty$
for each $t > 0$; otherwise, the total cost would be infinite in the time interval $[0, t]$. Thus,
when $K > 0$, we restrict attention to upward controls that have finitely many upward adjustments in
a finite interval. This is equivalent to requiring $Y_1$ to be a piecewise constant function on
each sample path. Under such an upward control, the upward adjustment times can be
listed as a discrete sequence $\{T_1(n) : n \ge 0\}$, where the $n$th upward adjustment time can
be defined recursively via

$$T_1(n) = \inf\{t > T_1(n-1) : \Delta Y_1(t) > 0\},$$

where, by convention, $T_1(0) = 0$ and $\Delta Y_1(t) = Y_1(t) - Y_1(t-)$. The amount of the $n$th
upward adjustment is denoted by

$$\xi_1(n) = Y_1(T_1(n)) - Y_1(T_1(n)-), \quad n = 0, 1, \ldots.$$
It is clear that specifying such an upward adjustment policy $Y_1 = \{Y_1(t), t \ge 0\}$ is equivalent
to specifying a sequence $\{(T_1(n), \xi_1(n)) : n \ge 0\}$. In particular, given the sequence, one
has

$$Y_1(t) = \sum_{i=0}^{N_1(t)} \xi_1(i), \tag{2.2}$$

and $N_1(t) = \max\{n \ge 0 : T_1(n) \le t\}$. Thus, when $K > 0$, it is sufficient to specify the
sequence $\{(T_1(n), \xi_1(n)) : n \ge 0\}$ to describe an upward adjustment policy. Similarly,
when $L > 0$, it is sufficient to specify the sequence $\{(T_2(n), \xi_2(n)) : n \ge 0\}$ to describe a
downward adjustment policy and

$$Y_2(t) = \sum_{i=0}^{N_2(t)} \xi_2(i). \tag{2.3}$$

Merging these two sequences, we have the sequence $\{(T_n, \xi_n), n \ge 0\}$, where $T_n$ is the $n$th
adjustment time of the inventory and $\xi_n$ is the amount of adjustment at time $T_n$. When
$\xi_n > 0$, the $n$th adjustment is an upward adjustment and when $\xi_n < 0$, the $n$th adjustment
is a downward adjustment. The policy $(Y_1, Y_2)$ is adapted if $T_n$ is an $\{\mathcal{F}_t\}$-stopping time
and each adjustment $\xi_n$ is $\mathcal{F}_{T_n}$-measurable.
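
The correspondence in (2.2) between an impulse sequence and the cumulative adjustment process can be sketched in a few lines. This is only an illustration under the assumption that the adjustment times are given as a sorted list; the helper name `cumulative_adjustment` is hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def cumulative_adjustment(times, amounts):
    """Build the right-continuous step function Y(t) = sum of amounts[n]
    over all n with times[n] <= t, as in (2.2)."""
    times = list(times)
    amounts = list(amounts)
    if len(times) != len(amounts):
        raise ValueError("times and amounts must have the same length")
    if times != sorted(times):
        raise ValueError("adjustment times must be nondecreasing")
    # partial[j] is the total of the first j adjustments.
    partial = [0.0] + list(accumulate(amounts))

    def Y(t):
        # bisect_right counts the adjustment times that are <= t.
        return partial[bisect_right(times, t)]

    return Y
```

For example, with adjustment times `[0.0, 1.5, 4.0]` and amounts `[2.0, 0.5, 1.0]`, the resulting function is piecewise constant and jumps by the corresponding amount at each adjustment time.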

In addition to the adjustment cost, the system is assumed to incur the holding cost at
rate $h(x)$: when the inventory level is at $Z(t) = x$, the system incurs a cost of $h(x)$ per
unit of time. Therefore, the cumulative holding cost in $[0, t]$ is

$$\int_0^t h(Z(s))\,ds.$$

Under a feasible policy $\varphi = \{(Y_1(t), Y_2(t))\}$ with initial inventory level $Z(0-) = x$, the
long-run average cost $AC(x, \varphi)$ is

$$AC(x, \varphi) = \limsup_{t \to \infty} \frac{1}{t}\, \mathbb{E}_x\left[ \int_0^t h(Z(s))\,ds + K N_1(t) + L N_2(t) + k Y_1(t) + \ell Y_2(t) \right], \tag{2.4}$$

where $\mathbb{E}_x$ is the expectation operator conditioning on the initial inventory level $Z(0-) = x$.
As mentioned earlier, when $K > 0$ and $L > 0$, it is sufficient to restrict feasible policies
to be of the impulse type given in (2.2) and (2.3). Such a Brownian inventory control model is
called the impulse Brownian control model. When $K = 0$ and $L = 0$, it turns out that under
an optimal policy, $N_1(t) = \infty$ and $N_2(t) = \infty$ with positive probability for each $t > 0$.
The corresponding control problem is called the instantaneous Brownian control model or
singular Brownian control model.
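
To make the terms inside the expectation in (2.4) concrete, the following sketch evaluates the finite-horizon average cost along a single sampled path of an impulse policy. It assumes, purely for illustration, that the path is recorded on a uniform grid of step `dt` and that adjustments are stored as signed amounts (positive for upward, negative for downward); the function name `average_cost` is hypothetical.

```python
def average_cost(path, dt, adjustments, h, K, k, L, ell):
    """Finite-horizon version of the bracketed quantity in (2.4), divided by t.

    path        : inventory levels sampled on a uniform grid of step dt
    adjustments : signed adjustment amounts (xi > 0 upward, xi < 0 downward)
    """
    t = dt * len(path)
    # Riemann-sum approximation of the integral of h(Z(s)) over [0, t].
    holding = sum(h(z) for z in path) * dt
    up = [xi for xi in adjustments if xi > 0]
    down = [-xi for xi in adjustments if xi < 0]
    total = (holding
             + K * len(up) + L * len(down)      # fixed costs K*N1 + L*N2
             + k * sum(up) + ell * sum(down))   # proportional costs k*Y1 + ell*Y2
    return total / t
```

Averaging this quantity over many independent paths and letting the horizon grow gives a Monte Carlo approximation of $AC(x, \varphi)$.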

In this paper, we make the following assumption on the holding cost function $h : \mathbb{R} \to \mathbb{R}_+$.

Assumption 1. Assume that the continuous holding cost function $h : \mathbb{R} \to \mathbb{R}_+$ satisfies
the following conditions: (a) it is convex; (b) there exists an $a$ such that $h \in C^2(\mathbb{R})$ except
at $a$, and $h(a) = 0$; (c) $h'(x) < 0$ for $x < a$ and $h'(x) > 0$ for $x > a$; (d) when $\lambda = 2\mu/\sigma^2 \ne 0$,
$h'(x)$ has smaller order than $e^{-\lambda x}$, that is

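$$\int_{-\infty}^{a} |h'(y)|\, e^{\lambda (y - a)}\, dy < \infty \quad \text{if } \lambda = \frac{2\mu}{\sigma^2} > 0 \tag{2.5}$$

and

$$\int_{a}^{\infty} |h'(y)|\, e^{\lambda (y - a)}\, dy < \infty \quad \text{if } \lambda = \frac{2\mu}{\sigma^2} < 0. \tag{2.6}$$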
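
As a quick sanity check of condition (d), consider the quadratic holding cost $h(x) = x^2$, for which $a = 0$. This choice of $h$ is only an illustration, and the computation assumes that conditions (2.5) and (2.6) are read as integrability of $|h'(y)|\, e^{\lambda (y - a)}$ with $\lambda = 2\mu/\sigma^2$.

```latex
\begin{align*}
h(x) &= x^2, \qquad a = 0, \qquad |h'(y)| = 2|y|, \\
\int_{-\infty}^{0} 2|y|\, e^{\lambda y}\, dy
  &= \int_{0}^{\infty} 2u\, e^{-\lambda u}\, du = \frac{2}{\lambda^2} < \infty
  && \text{if } \lambda > 0, \\
\int_{0}^{\infty} 2y\, e^{\lambda y}\, dy
  &= \frac{2}{\lambda^2} < \infty
  && \text{if } \lambda < 0.
\end{align*}
```

Under this reading, any convex $h$ whose derivative grows polynomially satisfies the condition, since the exponential factor dominates.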

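We only consider feasible policies that satisfy

$$\mathbb{E}_x[Y_i(t)] < \infty, \quad i = 1, 2, \tag{2.7}$$
$$\mathbb{E}_x[N_1(t)] < \infty \text{ when } K > 0 \quad \text{and} \quad \mathbb{E}_x[N_2(t)] < \infty \text{ when } L > 0 \tag{2.8}$$

for each $t \ge 0$. Otherwise, $AC(x, \varphi) = \infty$. In some applications, one might require
the inventory level to be nonnegative always, namely,

$$Z(t) \ge 0 \quad \text{for } t \ge 0.$$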



3 The Ito Formula 



In this section, we first state a version of Ito's formula. We then provide a lower bound
result for the long-run average cost in (2.4). Recall that for a function $g \in \mathbb{D}$, it is right
continuous on $[0, \infty)$ and has left limits in $(0, \infty)$. We use $g^c$ to denote the continuous part
of $g$, namely,

$$g^c(t) = g(t) - \sum_{0 \le s \le t} \Delta g(s) \quad \text{for } t \ge 0.$$

Here we assume $g(0-)$ is well defined. Recall that under any feasible policy $\varphi = (Y_1, Y_2)$, the
inventory process $Z = \{Z(t) : t \ge 0\}$ has the semimartingale representation (2.1). Because
Brownian motion has continuous sample paths, we have

$$Z^c(t) = X(t) + Y_1^c(t) - Y_2^c(t) \quad \text{for } t \ge 0. \tag{3.1}$$

Lemma 3.1. Assume that $f \in C^1(\mathbb{R})$ and $f'$ is absolutely continuous such that
$f'(b) - f'(a) = \int_a^b f''(u)\,du$ for any $a < b$ with $f''$ locally in $L^1(\mathbb{R})$. Then

$$\begin{aligned}
f(Z(t)) = {} & f(Z(0)) + \int_0^t \Gamma f(Z(s))\,ds + \sigma \int_0^t f'(Z(s))\,dW(s) \\
& + \int_0^t f'(Z(s-))\,dY_1^c(s) - \int_0^t f'(Z(s-))\,dY_2^c(s) + \sum_{0 < s \le t} \Delta f(Z(s)),
\end{aligned} \tag{3.2}$$

where

$$\Gamma f(x) = \frac{1}{2}\sigma^2 f''(x) + \mu f'(x), \quad \text{for each } x \in \mathbb{R} \text{ such that } f''(x) \text{ exists}, \tag{3.3}$$

is the generator of the $(\mu, \sigma^2)$-Brownian motion $X$, and $\int_0^t f'(Z(s))\,dW(s)$ is interpreted as
the Ito integral.
the ltd integral. 

Remark. Although $f''(u)$ is only defined for almost all $u$ in $\mathbb{R}$, $\int_0^t f''(Z(s))\,ds$ is uniquely
defined almost surely. Indeed,

$$\sigma^2 \int_0^t f''(Z(s))\,ds = \int_{\mathbb{R}} f''(a)\, L^a(t)\,da,$$

where $L^a$ is the local time of $Z$ at $a$.

Proof. For any semimartingale $Z$, it follows from Theorem 71 of [26, p. 221] and the
comment of [26, p. 70] that

$$\begin{aligned}
f(Z(t)) = {} & f(Z(0)) + \int_0^t f'(Z(s-))\,dZ^c(s) + \frac{1}{2} \int_0^t f''(Z(s-))\,d[Z^c, Z^c](s) \\
& + \sum_{0 < s \le t} \Delta f(Z(s)),
\end{aligned} \tag{3.4}$$

where $[Z^c, Z^c]$ is the quadratic variation of $Z^c$. Using the semimartingale representation (3.1),
we have

$$[Z^c, Z^c](t) = [X, X](t) = \sigma^2 t \tag{3.5}$$

and

$$\begin{aligned}
\int_0^t f'(Z(s-))\,dZ^c(s) = {} & \int_0^t f'(Z(s-))\,\mu\,ds + \int_0^t f'(Z(s-))\,\sigma\,dW(s) \\
& + \int_0^t f'(Z(s-))\,dY_1^c(s) - \int_0^t f'(Z(s-))\,dY_2^c(s)
\end{aligned} \tag{3.6}$$

for $t \ge 0$. Because $Y_1$ and $Y_2$ have at most countably many jump points, $Z$ has at most
countably many discontinuity points. Therefore, we have

$$\int_0^t f'(Z(s-))\,ds = \int_0^t f'(Z(s))\,ds \quad \text{and} \quad \int_0^t f'(Z(s-))\,dW(s) = \int_0^t f'(Z(s))\,dW(s) \tag{3.7}$$

for all $t \ge 0$ almost surely. Ito's formula (3.2) then follows from (3.4)-(3.7).

□
4 Lower Bound



In this section, we state and prove a theorem that establishes a lower bound for the optimal
long-run average cost. This theorem is closely related to the "verification theorem" in the
literature. Its proof is self-contained, using the Ito lemma in Section 3.

Theorem 4.1. Suppose that $f \in C^1(\mathbb{R})$ and $f'$ is absolutely continuous such that $f''$ is
locally $L^1$. Suppose that there exists a constant $M > 0$ such that $|f'(x)| \le M$ for all $x \in \mathbb{R}$.
Assume further that

$$\Gamma f(x) + h(x) \ge \gamma \quad \text{for almost all } x \in \mathbb{R}, \tag{4.1}$$
$$f(y) - f(x) \le K + k(x - y) \quad \text{for } y < x, \tag{4.2}$$
$$f(y) - f(x) \le L + \ell(y - x) \quad \text{for } x < y. \tag{4.3}$$

Then $AC(x, \varphi) \ge \gamma$ for each feasible policy $\varphi$ and each initial state $x \in \mathbb{R}$.

Remark. (i) When $K = 0$, condition (4.2) is equivalent to $f'(x) \ge -k$ for each
$x \in \mathbb{R}$. When $L = 0$, condition (4.3) is equivalent to $f'(x) \le \ell$ for each $x \in \mathbb{R}$. (ii)
Under an arbitrary control policy, the inventory level $Z$ can potentially reach any
level. Thus, we require the function $f$ to be defined on the entire real line $\mathbb{R}$. It is not enough
to have $f$ defined on a certain interval $[d, u]$.
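
Conditions (4.1)-(4.3) can be screened numerically for a candidate pair $(f, \gamma)$ before attempting a proof. The sketch below checks them on a finite grid; it assumes the caller supplies $f$, $f'$ and $f''$ as Python callables, and the name `check_lower_bound_conditions` is hypothetical. Passing the check on a grid is only a necessary condition: it proves nothing about points off the grid, about the entire real line, or about the boundedness of $f'$ required by the theorem.

```python
def check_lower_bound_conditions(f, df, d2f, h, gamma, mu, sigma,
                                 K, k, L, ell, grid, tol=1e-9):
    """Check (4.1), (4.2) and (4.3) at the points of a finite grid.

    Returns a tuple of three booleans, one per condition.
    """
    # (4.1): generator applied to f, plus holding cost, is at least gamma.
    gen_ok = all(0.5 * sigma ** 2 * d2f(x) + mu * df(x) + h(x) >= gamma - tol
                 for x in grid)
    # (4.2): f(y) - f(x) <= K + k (x - y) for y < x (upward adjustment from y to x).
    up_ok = all(f(y) - f(x) <= K + k * (x - y) + tol
                for x in grid for y in grid if y < x)
    # (4.3): f(y) - f(x) <= L + ell (y - x) for x < y (downward adjustment from y to x).
    down_ok = all(f(y) - f(x) <= L + ell * (y - x) + tol
                  for x in grid for y in grid if x < y)
    return gen_ok, up_ok, down_ok
```

For instance, the constant function $f \equiv 0$ with $\gamma = 0$ passes all three checks whenever $h \ge 0$, which recovers the trivial bound that the average cost is nonnegative.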

Proof. Let $\varphi = (Y_1, Y_2)$ be a feasible policy. We choose a version of $f''(x)$ such that (4.1)
holds for every $x \in \mathbb{R}$. By Ito's formula (3.2),

$$\begin{aligned}
f(Z(t)) = {} & f(Z(0-)) + \int_0^t \Gamma f(Z(s))\,ds + \sigma \int_0^t f'(Z(s))\,dW(s) + \int_0^t f'(Z(s-))\,dY_1^c(s) \\
& - \int_0^t f'(Z(s-))\,dY_2^c(s) + \sum_{0 \le s \le t} \Delta f(Z(s)) \\
\ge {} & f(Z(0-)) + \gamma t - \int_0^t h(Z(s))\,ds + \sigma \int_0^t f'(Z(s))\,dW(s) + \int_0^t f'(Z(s-))\,dY_1^c(s) \\
& - \int_0^t f'(Z(s-))\,dY_2^c(s) + \sum_{0 \le s \le t} \Delta f(Z(s)),
\end{aligned} \tag{4.4}$$

where the inequality is due to (4.1). In the rest of the proof, we separate into different
cases depending on the positivity of $K$ and $L$. We will provide a complete proof for the
case when $K > 0$ and $L > 0$. Sketches will be provided for proofs in other cases.

Case I: Assume that $K > 0$ and $L > 0$. In this case, it is sufficient to restrict feasible
policies to impulse control policies $\{(T_n, \xi_n) : n = 0, 1, \ldots\}$, so that $Y_1^c = 0$ and $Y_2^c = 0$.
Conditions (4.2) and (4.3) imply that $\Delta f(Z(T_n)) \ge -\phi(\xi_n)$ for $n = 0, 1, \ldots$, where

$$\phi(\xi) = \begin{cases} K + k\xi & \text{if } \xi > 0, \\ 0 & \text{if } \xi = 0, \\ L - \ell\xi & \text{if } \xi < 0. \end{cases}$$
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Therefore, (4.4) leads to 

fiZ{t)) > /(Z(0-)) + 7t - / KZ{s))ds + a / f\Z{s))dW{s) - V HCn) (4.5) 

Jo Jo 



for each t > 0. Fix an x G M. We assume that 

N{t) 



ej / /i(z(s))ds + V <^(e„) <^ 

V-^O n=0 ^ 



for each t > 0. Otherwise, AC{x, ip) = oo and thus AC{x,Lp) > 7 is triviahy satisfied. 
Because |/'(x)| < M, E^\J^ f'{Z{s))dW{t)\ < 00 and f\Z{s))dW{s) = 0. Meanwhile 

f{Z{t)) < {f{Z{t)))^ 

and E,j:[{f{Z{t))y] is weh defined, though it can be 00, where, for a 6 G M, fe'^ = max(6, 0). 
Taking E^; on the both sides of (4.5), we have 

E4(/(Z(t)))+]>E,[/(Z(0-))]+7t-E, / h{Z{s))ds+Y,<t^{in)]. 
Dividing both sides by t and taking limit as t — > 00, one has 

JV(t) 



lim inf — 

t— i>00 t 



eJ! h{Z{s))ds + Y,<l){^n)] +E,[(/(Z(t)))- 



> 7. (4.6) 



We consider two cases. In the first case when 



liminfiE4(/(Z(t)))+]=0, 
it is clear that (4.6) implies the theorem. Now we consider the case when 

liminf^E4(/(Z(i)))T=6>0. 

t->-oo t 

It follows that for sufficiently large $t$,

$$
\mathbb{E}_x[(f(Z(t)))^+] \ge (b/2)\,t.
\tag{4.7}
$$

Because $|f'(y)| \le M$ for all $y \in \mathbb{R}$,

$$
(f(y_1))^+ - (f(y_2))^+ \le |f(y_1) - f(y_2)| \le M|y_1 - y_2| \le M(|y_1| + |y_2|).
$$

Therefore,

$$
|Z(t)| \ge \frac{1}{M}\big[(f(Z(t)))^+ - (f(Z(0)))^+\big] - |Z(0)|,
$$

which, together with (4.7), implies that

$$
\begin{aligned}
\mathbb{E}_x|Z(t)| &\ge \frac{1}{M}\big(\mathbb{E}_x[(f(Z(t)))^+] - \mathbb{E}_x[(f(Z(0)))^+]\big) - \mathbb{E}_x|Z(0)| \\
&\ge \frac{1}{M}\big((b/2)\,t - \mathbb{E}_x[(f(Z(0)))^+]\big) - \mathbb{E}_x|Z(0)|
\end{aligned}
$$

for sufficiently large $t$. This implies that

$$
\liminf_{t\to\infty} \frac{1}{t}\int_0^t \mathbb{E}_x|Z(s)|\,ds = \infty.
\tag{4.8}
$$

Now we prove that

$$
\liminf_{t\to\infty} \frac{1}{t}\int_0^t \mathbb{E}_x[h(Z(s))]\,ds = \infty,
\tag{4.9}
$$

which implies that $AC(x, \varphi) = \infty$, thus proving the theorem.

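To see (4.9), by Assumption 1 (a) and (c), there exist constants $h_1 > 0$ and $c > 0$
such that

$$
h'(y) \ge h_1 \text{ for all } y > c \quad\text{and}\quad h'(y) \le -h_1 \text{ for all } y < -c.
\tag{4.10}
$$

Because of (4.8), one of the following two equations holds:

$$
\liminf_{t\to\infty} \frac{1}{t}\,\mathbb{E}_x\Big(\int_0^t Z(s)\,1_{\{Z(s) > c\}}\,ds\Big) = \infty,
\tag{4.11}
$$

$$
\liminf_{t\to\infty} \frac{1}{t}\,\mathbb{E}_x\Big(\int_0^t |Z(s)|\,1_{\{Z(s) < -c\}}\,ds\Big) = \infty.
\tag{4.12}
$$

Assume that (4.12) holds. Condition (4.10) implies that

$$
h(-c) - h(y) \le (-h_1)(-c - y) \quad\text{for } y < -c,
$$

or

$$
h(y) \ge h_1|y| + h(-c) - c\,h_1 \quad\text{for } y < -c.
$$

Therefore,

$$
h(y)\,1_{\{y < -c\}} \ge h_1|y|\,1_{\{y < -c\}} - c\,h_1.
$$

It follows that

$$
\liminf_{t\to\infty} \frac{1}{t}\,\mathbb{E}_x\Big(\int_0^t h(Z(s))\,ds\Big) \ge \liminf_{t\to\infty} \frac{1}{t}\,\mathbb{E}_x\Big(\int_0^t h(Z(s))\,1_{\{Z(s) < -c\}}\,ds\Big) = \infty,
$$

which proves (4.9). The case when (4.11) holds is analogous. Hence the theorem is proved for $K > 0$ and $L > 0$.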

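Case II: Assume that $K = 0$ and $L = 0$. Condition (4.2) leads to $f'(u) \ge -k$ for
all $u \in \mathbb{R}$ and condition (4.3) leads to $f'(u) \le \ell$ for all $u \in \mathbb{R}$. Because $f$ is continuous,
$\Delta f(Z(s)) \ne 0$ implies that $\Delta Z(s) \ne 0$. If $\Delta Z(s) > 0$, (4.2) implies that

$$
\Delta f(Z(s)) \ge -k\,\Delta Z(s).
$$

If $\Delta Z(s) < 0$, (4.3) implies that

$$
\Delta f(Z(s)) \ge -\ell\,|\Delta Z(s)|.
$$

Thus, the last three terms in (4.4) satisfy

$$
\begin{aligned}
&\int_0^t f'(Z(s-))\,dY_1^c(s) - \int_0^t f'(Z(s-))\,dY_2^c(s) + \sum_{0\le s\le t} \Delta f(Z(s)) \\
&\qquad \ge -kY_1^c(t) - \ell Y_2^c(t) - k\sum_{\substack{0\le s\le t \\ \Delta Z(s) > 0}} \Delta Z(s) - \ell\sum_{\substack{0\le s\le t \\ \Delta Z(s) < 0}} |\Delta Z(s)| \\
&\qquad = -kY_1^c(t) - \ell Y_2^c(t) - k\sum_{0\le s\le t} \Delta Y_1(s) - \ell\sum_{0\le s\le t} \Delta Y_2(s) \\
&\qquad \ge -kY_1(t) - \ell Y_2(t).
\end{aligned}
$$

Therefore, (4.4) leads to

$$
f(Z(t)) \ge f(Z(0-)) + \gamma t - \int_0^t h(Z(s))\,ds + \sigma\int_0^t f'(Z(s))\,dW(s) - kY_1(t) - \ell Y_2(t)
$$

for $t > 0$. The rest of the proof is identical to the case when $K > 0$ and $L > 0$.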

Case III: Assume $K > 0$ and $L = 0$. Consider a feasible policy $(Y_1, Y_2)$ with a finite
cost. The upward controls must be impulse controls and $Y_1(t) = \sum_{n=0}^{N_1(t)} \xi_1(n)$. Condition
(4.2) implies that

$$
\sum_{\substack{0\le s\le t \\ \Delta Z(s) > 0}} \Delta f(Z(s)) \ge -\sum_{n=0}^{N_1(t)} \big(K + k\,\xi_1(n)\big),
$$

and condition (4.3) implies that

$$
-\int_0^t f'(Z(s-))\,dY_2^c(s) + \sum_{\substack{0\le s\le t \\ \Delta Z(s) < 0}} \Delta f(Z(s)) \ge -\ell Y_2(t).
$$

Therefore, (4.4) leads to

$$
f(Z(t)) \ge f(Z(0-)) + \gamma t - \int_0^t h(Z(s))\,ds + \sigma\int_0^t f'(Z(s))\,dW(s) - \sum_{n=0}^{N_1(t)} \big(K + k\,\xi_1(n)\big) - \ell Y_2(t)
$$

for $t > 0$. The rest of the proof is identical to the case when $K > 0$ and $L > 0$.

Case IV: Assume that $K = 0$ and $L > 0$. This case is analogous to the case when
$K > 0$ and $L = 0$. Thus, the proof is omitted. □
5 Impulse Controls

In this section, we assume that $K > 0$ and $L > 0$. Therefore, we restrict our feasible policies
to impulse controls as in (2.2) and (2.3). An impulse control band policy is defined by four
parameters $d$, $D$, $U$, $u$, where $d < D < U < u$. Under the policy, when the inventory falls
to $d$, the system instantaneously orders items to bring it to level $D$; when the inventory
rises to $u$, the system adjusts its inventory to bring it down to $U$. Given a control band
policy $\varphi$, in Section 5.1 we provide a method for performance evaluation. As a byproduct,
we also obtain the relative value function associated with the control band policy. Then in
Section 5.2 we show that an optimal policy is a control band policy and present equations
that uniquely determine the optimal control band parameters $(d^*, D^*, U^*, u^*)$.
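To make the mechanics of a four-parameter control band policy concrete, the following Python sketch simulates an Euler-discretized inventory path under a band $\{d, D, U, u\}$ and accumulates holding, fixed, and proportional costs. It is only an illustration under stated assumptions: the function names `adjustment` and `simulate_band`, the time step, and all parameter values are hypothetical and not taken from the paper, and a discretized path can overshoot a boundary between steps, in which case the sketch simply jumps from the overshot level to $D$ or $U$.

```python
import math
import random

def adjustment(x, d, D, U, u):
    """Impulse size under the band policy: jump to D at or below d, to U at or above u."""
    if x <= d:
        return D - x
    if x >= u:
        return U - x
    return 0.0

def simulate_band(x, d, D, U, u, mu, sigma, K, k, L, ell, h,
                  T=200.0, dt=1e-3, seed=0):
    """Simulate a discretized Brownian inventory under {d, D, U, u}.

    Returns (time-average cost over [0, T], list of post-control inventory levels).
    This is an Euler approximation, not an exact simulation of the model.
    """
    rng = random.Random(seed)
    z, cost, levels = x, 0.0, []
    for _ in range(int(T / dt)):
        xi = adjustment(z, d, D, U, u)
        if xi > 0:                      # upward adjustment: fixed cost K, unit cost k
            cost += K + k * xi
        elif xi < 0:                    # downward adjustment: fixed cost L, unit cost ell
            cost += L + ell * (-xi)
        z += xi
        levels.append(z)
        cost += h(z) * dt               # holding cost accrued over one time step
        z += mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return cost / T, levels
```

For example, `simulate_band(0.0, -1.0, 0.0, 1.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, lambda x: x * x)` returns a Monte Carlo estimate of the long-run average cost for one sample path; the estimate depends on the seed, horizon, and step size.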



5.1 Control Band Policies 

We use $\{d, D, U, u\}$ to denote the control band policy associated with parameters $d$, $D$,
$U$, $u$. Let us fix a control band policy $\varphi = \{d, D, U, u\}$ and an initial inventory level
$Z(0-) = x$. The adjustment amount $\xi_n$ of the control band policy is given by

$$
\xi_0 =
\begin{cases}
D - x, & \text{if } x \le d, \\
0, & \text{if } d < x < u, \\
U - x, & \text{if } x \ge u,
\end{cases}
$$

and for $n = 1, 2, \ldots$,

$$
\xi_n =
\begin{cases}
D - d, & \text{if } Z(T_n-) = d, \\
U - u, & \text{if } Z(T_n-) = u,
\end{cases}
$$

where again $Z(t-)$ denotes the left limit at time $t$, $T_0 = 0$ and

$$
T_n = \inf\{t > T_{n-1} : Z(t) \in \{d, u\}\}
$$

is the $n$th adjustment time. (By convention, we assume $Z$ is right continuous having left
limits.) Our first task is to find its long-run average cost. We first present the following
theorem.

Theorem 5.1. Assume that a control band policy $\varphi = \{d, D, U, u\}$ is fixed. If there exist
a constant $\gamma$ and a twice continuously differentiable function $V : [d, u] \to \mathbb{R}$ that satisfies

$$
\Gamma V(x) + h(x) = \gamma, \qquad d \le x \le u,
\tag{5.1}
$$

with boundary conditions

$$
V(d) - V(D) = K + k(D - d),
\tag{5.2}
$$

$$
V(u) - V(U) = L + \ell(u - U),
\tag{5.3}
$$

then the average cost $AC(x, \varphi)$ is independent of the starting point $x \in \mathbb{R}$ and is given by
$\gamma$ in (5.1).

Remark. Equation (5.1) is known as the Poisson equation. The solution $V$ is known as
a relative value function associated with the control band policy $\varphi$. It is unique up to a
constant. One can evaluate $\gamma$ from (5.1) by taking $x$ to be any value in $[d, u]$.

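Proof. Consider the control band policy $\varphi = \{d, D, U, u\}$. Let $V$ be a twice continuously
differentiable function on $[d, u]$ that satisfies (5.1)-(5.3). Because $d \le Z(t) \le u$, by
Lemma 3.1, we have

$$
\mathbb{E}_x[V(Z(t))] = \mathbb{E}_x[V(Z(0))] + \mathbb{E}_x\Big[\int_0^t \Gamma V(Z(s))\,ds\Big] + \mathbb{E}_x\Big[\sum_{n=1}^{N(t)} \Theta_n\Big],
$$

where $\Theta_n = V(Z(T_n)) - V(Z(T_n-))$. Boundary conditions (5.2) and (5.3) imply that
$\Theta_n = V(Z(T_n)) - V(Z(T_n-)) = -\phi(\xi_n)$ for $n = 1, 2, \ldots$. Therefore,

$$
\begin{aligned}
\mathbb{E}_x[V(Z(t))] - \mathbb{E}_x[V(Z(0))] &= \mathbb{E}_x\Big[\int_0^t \Gamma V(Z(s))\,ds\Big] + \mathbb{E}_x\Big[\sum_{n=1}^{N(t)} \Theta_n\Big] \\
&= \gamma t - \mathbb{E}_x\Big[\int_0^t h(Z(s))\,ds\Big] - \mathbb{E}_x\Big[\sum_{n=1}^{N(t)} \phi(\xi_n)\Big].
\end{aligned}
$$

Dividing both sides by $t$ and letting $t \to \infty$, we have $AC(x, \varphi) = \gamma$ because

$$
\lim_{t\to\infty} \frac{1}{t}\,\mathbb{E}_x[V(Z(t))] = 0 \quad\text{and}\quad \mathbb{E}_x[V(Z(0))] = V(x + \xi_0).
$$

□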



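We end this section by explicitly finding a solution $(V, \gamma)$ to (5.1)-(5.3). The solution
$V$ is unique up to a constant. In the following proposition, let

$$
\lambda = 2\mu/\sigma^2.
\tag{5.4}
$$

Proposition 1. Let $\varphi = \{d, D, U, u\}$ be a control band policy with

$$
d < D < U < u.
$$

Let $m \in \mathbb{R}$ be any fixed number. Define

$$
V(x) = \int_m^x g(y)\,dy
$$

with

$$
g(x) = V'(m)\,e^{\lambda(m-x)} + \gamma\,\frac{2}{\sigma^2}\int_m^x e^{\lambda(y-x)}\,dy - \frac{2}{\sigma^2}\int_m^x h(y)\,e^{\lambda(y-x)}\,dy,
\tag{5.5}
$$

where

$$
\gamma = \frac{a_1\big(c_2 + L + \ell(u - U)\big) + a_2\big(c_1 + K + k(D - d)\big)}{a_2 b_1 + a_1 b_2},
\tag{5.6}
$$

$$
V'(m) = \frac{b_1\big(c_2 + L + \ell(u - U)\big) - b_2\big(c_1 + K + k(D - d)\big)}{a_2 b_1 + a_1 b_2}.
\tag{5.7}
$$

Then $(V, \gamma)$ is a solution to (5.1)-(5.3). In (5.6) and (5.7), we set

$$
a_1 = \int_d^D e^{\lambda(m-x)}\,dx, \qquad a_2 = \int_U^u e^{\lambda(m-x)}\,dx,
\tag{5.8}
$$

$$
b_1 = -\frac{2}{\sigma^2}\int_d^D\!\!\int_m^x e^{\lambda(y-x)}\,dy\,dx, \qquad b_2 = \frac{2}{\sigma^2}\int_U^u\!\!\int_m^x e^{\lambda(y-x)}\,dy\,dx,
\tag{5.9}
$$

$$
c_1 = -\frac{2}{\sigma^2}\int_d^D\!\!\int_m^x h(y)\,e^{\lambda(y-x)}\,dy\,dx, \qquad c_2 = \frac{2}{\sigma^2}\int_U^u\!\!\int_m^x h(y)\,e^{\lambda(y-x)}\,dy\,dx.
\tag{5.10}
$$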
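Proof. Equation (5.1) is equivalent to

$$
\big(e^{\lambda x}V'(x)\big)' = \frac{2}{\sigma^2}\big(\gamma - h(x)\big)e^{\lambda x}.
$$

Integrating over $[m, x]$ on both sides, we have

$$
e^{\lambda x}V'(x) = e^{\lambda m}V'(m) + \gamma\,\frac{2}{\sigma^2}\int_m^x e^{\lambda y}\,dy - \frac{2}{\sigma^2}\int_m^x h(y)\,e^{\lambda y}\,dy,
$$

or equivalently

$$
V'(x) = e^{\lambda(m-x)}V'(m) + \gamma\,\frac{2}{\sigma^2}\int_m^x e^{\lambda(y-x)}\,dy - \frac{2}{\sigma^2}\int_m^x h(y)\,e^{\lambda(y-x)}\,dy.
$$

Boundary conditions (5.2) and (5.3) become

$$
V'(m)\int_d^D e^{\lambda(m-x)}\,dx + \gamma\,\frac{2}{\sigma^2}\int_d^D\!\!\int_m^x e^{\lambda(y-x)}\,dy\,dx
= \frac{2}{\sigma^2}\int_d^D\!\!\int_m^x h(y)\,e^{\lambda(y-x)}\,dy\,dx - K - k(D - d),
\tag{5.11}
$$

$$
V'(m)\int_U^u e^{\lambda(m-x)}\,dx + \gamma\,\frac{2}{\sigma^2}\int_U^u\!\!\int_m^x e^{\lambda(y-x)}\,dy\,dx
= \frac{2}{\sigma^2}\int_U^u\!\!\int_m^x h(y)\,e^{\lambda(y-x)}\,dy\,dx + L + \ell(u - U).
\tag{5.12}
$$

Using the coefficients defined in (5.8)-(5.10), we see the boundary conditions (5.11) and
(5.12) become

$$
\begin{aligned}
a_1 V'(m) - \gamma b_1 &= -\big(c_1 + K + k(D - d)\big), \\
a_2 V'(m) + \gamma b_2 &= c_2 + L + \ell(u - U),
\end{aligned}
$$

from which we have a unique solution for $\gamma$ and $V'(m)$ given in (5.6) and (5.7). □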
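The formulas (5.6)-(5.10) translate directly into a small numerical routine for evaluating a given control band policy. The Python sketch below is a minimal illustration, assuming $\mu \ne 0$ and a smooth holding cost so that composite Simpson quadrature is adequate; the helper names `simpson` and `band_cost` and the parameter values in the usage note are hypothetical, not from the paper. The inner integral of $e^{\lambda(y-x)}$ is evaluated in closed form as $(1 - e^{\lambda(m-x)})/\lambda$.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with n (even) subintervals; works for b < a too."""
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * step)
    return total * step / 3

def band_cost(d, D, U, u, mu, sigma, K, k, L, ell, h, m):
    """Return (gamma, V'(m)) from (5.6)-(5.7) for the band {d, D, U, u}."""
    lam = 2.0 * mu / sigma**2            # lambda in (5.4)
    c = 2.0 / sigma**2                   # common factor 2 / sigma^2
    expo = lambda x: math.exp(lam * (m - x))
    inner_e = lambda x: (1.0 - math.exp(lam * (m - x))) / lam   # int_m^x e^{lam (y - x)} dy
    inner_h = lambda x: simpson(lambda y: h(y) * math.exp(lam * (y - x)), m, x)
    a1, a2 = simpson(expo, d, D), simpson(expo, U, u)                    # (5.8)
    b1, b2 = -c * simpson(inner_e, d, D), c * simpson(inner_e, U, u)      # (5.9)
    c1, c2 = -c * simpson(inner_h, d, D), c * simpson(inner_h, U, u)      # (5.10)
    det = a2 * b1 + a1 * b2
    up = c2 + L + ell * (u - U)
    down = c1 + K + k * (D - d)
    gamma = (a1 * up + a2 * down) / det    # (5.6)
    dV_m = (b1 * up - b2 * down) / det     # (5.7)
    return gamma, dV_m
```

A useful sanity check is that the returned $\gamma$ should not depend on the arbitrary anchor $m$, and that adding a constant to $h$ shifts $\gamma$ by the same constant; for instance, one can compare `band_cost(-1, 0, 1, 2, 1, 1, 1, 1, 1, 1, lambda x: x * x, 0.0)` with the same call using `m = 0.3`.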
5.2 Optimal Policy and Optimal Parameters

Theorem 4.1 suggests the following strategy to obtain an optimal policy. We hope that
a control band policy is optimal. Therefore, the first task is to find an optimal policy
among all control band policies. We denote this optimal control band policy by $\varphi^* =
\{d^*, D^*, U^*, u^*\}$ with long-run average cost $\gamma^*$. We hope that $\gamma^*$ can be used as the constant
in (4.1) of Theorem 4.1. To find the corresponding $f$ that, together with $\gamma^*$, satisfies
all the conditions of Theorem 4.1, we start with the relative value function $V(x)$ associated
with the policy $\varphi^*$. This relative value function $V$ is defined on the finite interval $[d^*, u^*]$.
We need to extend $V$ so that it is defined on the entire real line $\mathbb{R}$. Given that $V(x)$ is the
relative value function, it is natural to extend it in the following way:

$$
f(x) =
\begin{cases}
K + k(D^* - x) + V(D^*) & \text{for } x < d^*, \\
V(x) & \text{for } x \in [d^*, u^*], \\
L + \ell(x - U^*) + V(U^*) & \text{for } x > u^*.
\end{cases}
\tag{5.13}
$$

Boundary conditions (5.2) and (5.3) ensure the continuity of $f$ at $d^*$ and $u^*$. Therefore,
$f \in C(\mathbb{R})$. We are yet to determine the optimal parameters $(d^*, D^*, U^*, u^*)$. Now we
provide an intuitive argument on the conditions that should be imposed on the optimal
parameters. Since we wish $f \in C^1$, we should have

$$
V'(d^*) = -k, \qquad V'(u^*) = \ell.
\tag{5.14}
$$

Also, starting from $d^*$, the system should jump to a $D$ that minimizes

$$
K + k(D - d^*) + V(D).
$$

Therefore, at $D = D^*$, $k + V'(D) = 0$, namely,

$$
V'(D^*) = -k.
\tag{5.15}
$$

Similarly, one should have

$$
V'(U^*) = \ell.
\tag{5.16}
$$

In this section, we will first prove in Theorem 5.2 the existence of parameters $d^*$, $D^*$,
$U^*$ and $u^*$ such that the relative value function $V$ corresponding to the control band policy
$\varphi^* = \{d^*, D^*, U^*, u^*\}$ satisfies (5.1)-(5.3) and (5.14)-(5.16). As part of the solution, we are
to find the boundary points $d^*$, $D^*$, $U^*$ and $u^*$ from equations (5.1)-(5.3) and (5.14)-(5.16).
These equations define a free boundary problem. A free boundary problem
is much more difficult to solve than a boundary value problem. We then
prove in Theorem 5.3 that the extension $f$ in (5.13) and $\gamma^* = AC(x, \varphi^*)$ jointly satisfy all
the conditions in Theorem 4.1; therefore, the control band policy $\varphi^*$ is optimal among all
feasible policies.
To ease the presentation, in the rest of this section, we assume that $\mu > 0$. The
statement and analysis for the cases $\mu < 0$ and $\mu = 0$ are analogous and are omitted.

To facilitate the presentation of Theorem 5.2, we first find a general solution $V$ to (5.1)
without worrying about boundary conditions (5.2) and (5.3). Proposition 1 shows that
such $V$ is given in the form

$$
V(x) = \int_m^x g(y)\,dy \quad\text{for } x \in [d^*, u^*],
\tag{5.17}
$$

where $g$ is given by (5.5) and $m$ is some constant. Since the optimal boundary points
$d^*$, $D^*$, $U^*$, $u^*$ are yet to be determined, the constant $\gamma$ on the right side of (5.1) is also yet
to be determined. Differentiating both sides of (5.1) with respect to $x$, we have shown that
$V'(x) = g(x)$ is a solution to

$$
\Gamma g(x) + h'(x) = 0 \quad\text{for all } x \in \mathbb{R} \setminus \{a\}.
\tag{5.18}
$$

In (5.5), we fix $m = a$ and set $A = 2\gamma/(\lambda\sigma^2)$ and $B = A - V'(m)$. Noting that $\lambda/\mu = 2/\sigma^2$,
we have $g(x) = g_{A,B}(x)$, where

$$
\begin{aligned}
g_{A,B}(x) &= A - B e^{-\lambda(x-a)} - \frac{\lambda}{\mu}\int_a^x h(y)\,e^{-\lambda(x-y)}\,dy \\
&= A - B e^{-\lambda(x-a)} - \frac{\lambda}{\mu}\int_a^x h(x - y + a)\,e^{-\lambda(y-a)}\,dy.
\end{aligned}
\tag{5.19}
$$

To summarize, we have the following lemma.

Lemma 5.1. For each $A, B \in \mathbb{R}$, the function $g(x) = g_{A,B}(x)$ is a solution to equation (5.18).

The following theorem characterizes optimal parameters $(d^*, D^*, U^*, u^*)$ via the solution
$g = g_{A,B}$. Figure 1 depicts the function $g$ used in the theorem.

Theorem 5.2. Assume that the holding cost function $h$ satisfies Assumption 1. There
exist unique $A^*$, $B^*$, $d^*$, $D^*$, $U^*$ and $u^*$ with

$$
d^* < x_1 < D^* < U^* < x_2 < u^*
$$

such that the corresponding $g(x) = g_{A^*,B^*}(x)$ satisfies

$$
\int_{d^*}^{D^*} [g(x) + k]\,dx = -K,
\tag{5.20}
$$

$$
\int_{U^*}^{u^*} [g(x) - \ell]\,dx = L,
\tag{5.21}
$$

$$
g(d^*) = g(D^*) = -k,
\tag{5.22}
$$

$$
g(U^*) = g(u^*) = \ell.
\tag{5.23}
$$

Furthermore, $g$ has a local minimum at $x_1 < a$ and a local maximum at $x_2 > a$. The function
$g$ is decreasing on $(-\infty, x_1)$, increasing on $(x_1, x_2)$ and decreasing again on $(x_2, \infty)$.

Figure 1: There exist $x_1 < x_2$ such that the function $g$ decreases in $(-\infty, x_1)$, increases
in $(x_1, x_2)$, and decreases again in $(x_2, \infty)$. Parameters $d^*$, $D^*$, $U^*$ and $u^*$ are determined
by $g(d^*) = g(D^*) = -k$, $g(U^*) = g(u^*) = \ell$, the shaded area between $U^*$ and $u^*$ is $L$, and
the shaded area between $d^*$ and $D^*$ is $K$. In the interval $[d^*, u^*]$, $g$ is the derivative of the
relative value function associated with the control band policy $\{d^*, D^*, U^*, u^*\}$.

If $g$ satisfies all conditions (5.18), (5.20)-(5.23) in Theorem 5.2, $V(x)$ in (5.17) clearly
satisfies all conditions (5.1)-(5.3) and (5.14)-(5.16). The proof of Theorem 5.2 is long, and
we defer it to the end of this section.

Theorem 5.3. Assume that the holding cost function $h$ satisfies Assumption 1. Let $d^* <
D^* < U^* < u^*$, along with constants $A^*$ and $B^*$, be the unique solution in Theorem 5.2.
Then the control band policy $\varphi^* = \{d^*, D^*, U^*, u^*\}$ is optimal among all feasible policies.

Proof. Let $g(x)$ be the function in (5.19) with $A = A^*$ and $B = B^*$. Let

$$
\bar g(x) =
\begin{cases}
-k, & x < d^*, \\
g(x), & d^* \le x \le u^*, \\
\ell, & x > u^*.
\end{cases}
$$

Conditions (5.22) and (5.23) ensure that $\bar g$ is in $C(\mathbb{R})$. Define

$$
V(x) = \int_a^x \bar g(y)\,dy.
$$

Let $\gamma^*$ be the long-run average cost under policy $\varphi^*$. We now show that $V$ and $\gamma^*$ satisfy
all the conditions in Theorem 4.1. Thus, Theorem 4.1 shows that the long-run average cost
under any feasible policy is at least $\gamma^*$. Since $\gamma^*$ is the long-run average cost under the
control band policy $\varphi^*$, $\gamma^*$ is the optimal cost and the control band policy $\varphi^*$ is optimal
among all feasible policies.

First, $V(x)$ is in $C^2((d^*, u^*))$. Condition (5.20) implies

$$
V(d^*) - V(D^*) = K + k(D^* - d^*)
\tag{5.24}
$$

and (5.21) implies

$$
V(u^*) - V(U^*) = L + \ell(u^* - U^*).
$$

Equation (5.18) implies that $V$ satisfies

$$
\Gamma V(x) + h(x) = \text{constant} \quad\text{for } x \in (d^*, u^*).
$$

By Theorem 5.1, the constant must be the long-run average cost $\gamma^*$ under control band
policy $\varphi^*$.

Now, we show that $V(x)$ satisfies the rest of the conditions in Theorem 4.1. Conditions
(5.22) and (5.23) imply that the truncated function $\bar g$ is continuous in $\mathbb{R}$. Therefore, $V \in C^1(\mathbb{R})$.
Clearly, $V''(x) = 0$ for $x \notin [d^*, u^*]$, and $V''(x) = g'(x)$ for $x \in (d^*, u^*)$. Let

$$
M = \sup_{x \in [d^*, u^*]} |g(x)|.
$$

We have $|V'(x)| \le M$ for all $x \in \mathbb{R}$. Because

$$
\Gamma V(x) + h(x) = \gamma^* \quad\text{for } x \in (d^*, u^*),
$$

(4.1) is satisfied for $x \in (d^*, u^*)$. In particular,

$$
\tfrac{1}{2}\sigma^2 g'(d^*) + \mu g(d^*) + h(d^*) = \gamma^*
$$

and

$$
\tfrac{1}{2}\sigma^2 g'(u^*) + \mu g(u^*) + h(u^*) = \gamma^*.
$$

It follows from part (b) and part (c) of Lemma 5.2 in Section 5.3 that $d^* < x_1 < a <
x_2 < u^*$, $g'(d^*) \le 0$ and $g'(u^*) \le 0$ (see Figure 1). Thus, we have $\mu g(d^*) + h(d^*) \ge \gamma^*$ and
$\mu g(u^*) + h(u^*) \ge \gamma^*$. Now, for $x < d^*$, $\Gamma V(x) + h(x) = \mu(-k) + h(x) \ge \mu g(d^*) + h(d^*) \ge \gamma^*$.
Similarly, for $x > u^*$, $\Gamma V(x) + h(x) = \mu\ell + h(x) \ge \mu g(u^*) + h(u^*) \ge \gamma^*$.
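Now we verify that $V$ satisfies (4.2). Let $x, y \in \mathbb{R}$ with $y < x$. Then,

$$
\begin{aligned}
V(x) - V(y) + k(x - y) &= \int_y^x [\bar g(z) + k]\,dz \\
&\ge \int_{(y \vee d^*) \wedge D^*}^{(x \wedge D^*) \vee d^*} [\bar g(z) + k]\,dz \\
&\ge \int_{d^*}^{D^*} [g(z) + k]\,dz \\
&= -K,
\end{aligned}
$$

where the first inequality follows from $\bar g(z) = -k$ for $z < d^*$, $\bar g(z) = g(z) \ge -k$
for $D^* < z < u^*$ and $\bar g(z) = \ell > -k$ for $z > u^*$, and the second inequality follows from the
fact that $\bar g(z) = g(z) \le -k$ for $z \in [d^*, D^*]$; see Figure 1. Thus (4.2) is proved.
It remains to verify that $V$ satisfies (4.3). For $x, y \in \mathbb{R}$ with $y > x$,

$$
\begin{aligned}
V(y) - V(x) - \ell(y - x) &= \int_x^y [\bar g(z) - \ell]\,dz \\
&\le \int_{(x \vee U^*) \wedge u^*}^{(y \wedge u^*) \vee U^*} [\bar g(z) - \ell]\,dz \\
&\le \int_{U^*}^{u^*} [g(z) - \ell]\,dz \\
&= L,
\end{aligned}
$$

proving (4.3). □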



5.3 Optimal Control Band Parameters 

This section is devoted to the proof of Theorem 5.2. We separate the proof into a series
of lemmas. Throughout this section, we assume that $\mu > 0$ and that the holding cost
function $h$ satisfies Assumption 1. Recall the $\lambda$ defined in (5.4).
Define

$$
\bar B = -\frac{1}{\mu}\int_{-\infty}^a h'(y)\,e^{\lambda(y-a)}\,dy.
\tag{5.25}
$$

Because $h'(x) < 0$ for $x < a$, $\bar B > 0$. For $A, B \in \mathbb{R}$, recall the function $g_{A,B}$ defined in
(5.19). We sometimes use the fact that

$$
g_{A,B}(x) = A + g_{0,B}(x) \quad\text{for } x \in \mathbb{R}.
\tag{5.26}
$$

When the context is clear, we simply use $g$ to denote $g_{A,B}$. For the following lemma,
readers are referred to Figure 1.

Lemma 5.2. (a) For any $A \in \mathbb{R}$ and for each fixed $B \in (0, \bar B)$, $g_{A,B}$ attains a unique
minimum in $(-\infty, a)$ at $x_1 = x_1(B) \in (-\infty, a)$. The function $g_{A,B}$ attains a unique
maximum in $(a, \infty)$ at $x_2 = x_2(B) \in (a, \infty)$. Both $x_1(B)$ and $x_2(B)$ are independent of $A$.

(b) For each fixed $B \in (0, \bar B)$, the local minimizer $x_1 = x_1(B)$ is the unique solution in
$(-\infty, a)$ to

$$
B - \frac{1}{\mu}\int_a^x h'(y)\,e^{\lambda(y-a)}\,dy = 0.
\tag{5.27}
$$

The local maximizer $x_2 = x_2(B)$ is the unique solution in $(a, \infty)$ to (5.27).

(c) For each $B \in (0, \bar B)$, $g'_{A,B}(x) < 0$ for $x \in (-\infty, x_1(B))$, $g'_{A,B}(x) > 0$ for $x \in
(x_1(B), x_2(B))$, and $g'_{A,B}(x) < 0$ for $x \in (x_2(B), \infty)$.
Proof. Differentiating $g(x) = g_{A,B}(x)$ in (5.19) and noting $h(a) = 0$, we have

$$
\begin{aligned}
g'(x) &= \lambda B e^{-\lambda(x-a)} - \frac{\lambda}{\mu}\int_a^x h'(x - y + a)\,e^{-\lambda(y-a)}\,dy \\
&= \lambda\Big(B - \frac{1}{\mu}\int_a^x h'(y)\,e^{\lambda(y-a)}\,dy\Big)e^{-\lambda(x-a)} \\
&= \lambda F_1(B, x)\,e^{-\lambda(x-a)},
\end{aligned}
\tag{5.28}
$$

where, for $x \in \mathbb{R}$,

$$
F_1(B, x) = B - \frac{1}{\mu}\int_a^x h'(y)\,e^{\lambda(y-a)}\,dy.
\tag{5.29}
$$

Clearly $g'(x) = 0$ if and only if $F_1(B, x) = 0$. Because

$$
\frac{\partial}{\partial x}F_1(B, x) = -\frac{1}{\mu}h'(x)\,e^{\lambda(x-a)}
\tag{5.30}
$$

and $h'(x) < 0$ for $x < a$ and $h'(x) > 0$ for $x > a$, we have that $F_1(B, x)$ increases in $x < a$
and decreases in $x > a$. For $B > 0$, we have

$$
F_1(B, a) = B > 0.
$$

For any $B \in (0, \bar B)$,

$$
\lim_{x \downarrow -\infty} F_1(B, x) = B - \bar B < 0.
$$

Therefore, there exists a unique $x_1 = x_1(B) \in (-\infty, a)$ such that $F_1(B, x_1) = 0$ or equivalently
$g'(x_1) = 0$. Also, for any fixed $B$,

$$
\lim_{x \uparrow +\infty} F_1(B, x) = -\infty.
$$

Therefore, for any $B > 0$, there exists a unique $x_2 = x_2(B) \in (a, \infty)$ such that $F_1(B, x_2) = 0$
or equivalently $g'(x_2) = 0$. For $B \in (0, \bar B)$, it is clear that

$$
g'(x) < 0 \text{ for } x \in (-\infty, x_1), \quad g'(x) > 0 \text{ for } x \in (x_1, x_2) \quad\text{and}\quad g'(x) < 0 \text{ for } x \in (x_2, \infty).
$$

Thus the lemma is proved. □

Remark. The local maximizer $x_2(B)$ is well defined for all $B \in (0, \infty)$, whereas the local
minimizer $x_1(B)$ is defined only for $B \in (0, \bar B)$.

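Lemma 5.3. (a) The local minimizer $x_1(B)$ is continuous and strictly decreasing in $B \in
(0, \bar B)$. The local maximizer $x_2(B)$ is continuous and strictly increasing in $B \in (0, \infty)$.
Furthermore,

$$
\lim_{B \downarrow 0} x_i(B) = a, \qquad i = 1, 2,
\tag{5.31}
$$

and

$$
\lim_{B \uparrow \bar B} x_1(B) = -\infty \quad\text{and}\quad \lim_{B \uparrow \bar B} x_2(B) = x_2(\bar B) \in (a, \infty).
\tag{5.32}
$$

(b) For each $B \in (0, \bar B)$,

$$
g_{A,B}(x_i(B)) = A - \frac{1}{\mu}h(x_i(B)) \quad\text{for } i = 1, 2.
\tag{5.33}
$$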

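Proof. (a) Recall the function $F_1$ defined in (5.29). Obviously, $F_1$ and $\partial F_1/\partial x$ are continuous,
and $\partial F_1/\partial x$ is given in (5.30). One has

$$
\frac{\partial F_1}{\partial x} > 0 \quad\text{for } x \in (-\infty, a),
$$

where we have used the fact that $h'(x) < 0$ for $x \in (-\infty, a)$. Using the Implicit Function
Theorem, $x_1(B)$ is continuously differentiable in $B \in (0, \bar B)$, and

$$
\frac{dx_1(B)}{dB} = \frac{\mu}{h'(x_1(B))\,e^{\lambda(x_1(B)-a)}} < 0.
\tag{5.34}
$$

Thus, $x_1(B)$ is strictly decreasing in $B \in (0, \bar B)$. Similarly, we have

$$
\frac{dx_2(B)}{dB} = \frac{\mu}{h'(x_2(B))\,e^{\lambda(x_2(B)-a)}} > 0,
\tag{5.35}
$$

proving that $x_2(B)$ is continuously differentiable and strictly increasing in $B \in (0, \infty)$. The
limits in (5.31) and (5.32) can be proved easily following the definition of $x_1(B)$ and $x_2(B)$.
(b) We have from (5.19) and (5.27) that

$$
\begin{aligned}
g_{A,B}(x_i(B)) &= A - B e^{-\lambda(x_i(B)-a)} - \frac{\lambda}{\mu}\int_a^{x_i(B)} h(x_i(B) - y + a)\,e^{-\lambda(y-a)}\,dy \\
&= A - \frac{1}{\mu}\int_a^{x_i(B)} h'(x_i(B) - y + a)\,e^{-\lambda(y-a)}\,dy - \frac{\lambda}{\mu}\int_a^{x_i(B)} h(x_i(B) - y + a)\,e^{-\lambda(y-a)}\,dy \\
&= A - \frac{\lambda}{\mu}\int_a^{x_i(B)} h(x_i(B) - y + a)\,e^{-\lambda(y-a)}\,dy \\
&\quad + \frac{1}{\mu}\Big[h(x_i(B) - y + a)\,e^{-\lambda(y-a)}\Big]_{y=a}^{y=x_i(B)} + \frac{\lambda}{\mu}\int_a^{x_i(B)} h(x_i(B) - y + a)\,e^{-\lambda(y-a)}\,dy \\
&= A - \frac{1}{\mu}h(x_i(B)),
\end{aligned}
$$

where the third equality uses integration by parts and the last uses $h(a) = 0$, thus proving (5.33). □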
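Lemmas 5.2 and 5.3 can be checked numerically for a concrete holding cost. The Python sketch below is an illustration only: it assumes the quadratic cost $h(x) = x^2$ (so $a = 0$), for which $F_1$ in (5.29), $g_{A,B}$ in (5.19) and $\bar B$ in (5.25) have closed forms; the parameter values and the names `F1`, `g`, `bisect` and `stationary_points` are hypothetical. It locates $x_1(B)$ and $x_2(B)$ by bisection on (5.27), after which identity (5.33) can be verified directly.

```python
import math

MU, SIGMA = 1.0, 1.0                 # illustrative drift and volatility (assumed values)
LAM = 2.0 * MU / SIGMA**2            # lambda from (5.4)
B_BAR = 2.0 / (MU * LAM**2)          # (5.25) evaluated for h(x) = x^2, a = 0

def F1(B, x):
    """F_1(B, x) from (5.29) with h'(y) = 2y, integrated in closed form."""
    return B - (2.0 / MU) * ((x / LAM - 1.0 / LAM**2) * math.exp(LAM * x) + 1.0 / LAM**2)

def g(A, B, x):
    """g_{A,B}(x) from (5.19) with h(y) = y^2 and a = 0, integrated in closed form."""
    e = math.exp(-LAM * x)
    return A - B * e - (x * x - 2.0 * x / LAM + 2.0 / LAM**2 - 2.0 * e / LAM**2) / MU

def bisect(f, lo, hi, iters=200):
    """Plain bisection; assumes f changes sign between lo and hi."""
    lo_positive = f(lo) > 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def stationary_points(B):
    """Return (x1(B), x2(B)) for 0 < B < B_BAR by solving (5.27) on each side of a = 0."""
    lo = -1.0
    while F1(B, lo) >= 0:            # walk left until F_1 turns negative
        lo *= 2.0
    hi = 1.0
    while F1(B, hi) >= 0:            # walk right until F_1 turns negative
        hi *= 2.0
    return bisect(lambda x: F1(B, x), lo, 0.0), bisect(lambda x: F1(B, x), 0.0, hi)
```

With these definitions, `x1, x2 = stationary_points(0.3)` gives the local minimizer and maximizer, and `g(A, 0.3, x1)` should agree with `A - x1**2 / MU` for any `A`, as (5.33) states.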
In the following lemma, we set for each $B \in (0, \infty)$,

$$
\underline{A}(B) = \ell - g_{0,B}(x_2(B)) = h(x_2(B))/\mu + \ell.
\tag{5.36}
$$

For any $B \in (0, \infty)$, following (5.26), we have

$$
g_{A,B}(x_2(B)) \ge \ell
\tag{5.37}
$$

for any $A \ge \underline{A}(B)$. Similarly, for any $B \in (0, \bar B)$, we define

$$
\bar A(B) = -k - g_{0,B}(x_1(B)) = h(x_1(B))/\mu - k.
\tag{5.38}
$$

Following (5.26), we have

$$
g_{A,B}(x_1(B)) \le -k
\tag{5.39}
$$

for any $A \le \bar A(B)$. Our next lemma determines when $\underline{A}(B) < \bar A(B)$.

Lemma 5.4. For each $B \in (0, \bar B)$, let

$$
\bar g(B) = g_{A,B}(x_2(B)) - g_{A,B}(x_1(B))
\tag{5.40}
$$

be the distance between the local maximum and the local minimum. Then $\bar g(B)$ is independent
of $A$. The function $\bar g(B)$ is continuous and strictly increasing in $B \in (0, \bar B)$ with

$$
\lim_{B \downarrow 0} \bar g(B) = 0 \quad\text{and}\quad \lim_{B \uparrow \bar B} \bar g(B) = +\infty.
\tag{5.41}
$$

Thus, there exists a unique $\underline{B}_1 \in (0, \bar B)$ such that

$$
\bar g(\underline{B}_1) = k + \ell.
\tag{5.42}
$$

For each $B \in (\underline{B}_1, \bar B)$,

$$
\underline{A}(B) < \bar A(B).
\tag{5.43}
$$

Proof. By (5.26), $\bar g(B) = g_{0,B}(x_2(B)) - g_{0,B}(x_1(B))$. Thus, $\bar g(B)$ is independent of $A$. It
follows from (5.33), (5.34) and (5.35) that for $B \in (0, \bar B)$,

$$
\frac{d\bar g(B)}{dB} = -e^{-\lambda(x_2(B)-a)} + e^{-\lambda(x_1(B)-a)} > 0.
\tag{5.44}
$$

Thus $\bar g(B)$ is strictly increasing. The limit (5.41) follows from (5.31) and (5.32). The
existence of a unique $\underline{B}_1$ satisfying (5.42) follows from (5.41) and the continuity and monotonicity
of $\bar g$. Inequality (5.43) follows from the definition of $\underline{B}_1$ and the fact that $\bar A(B) - \underline{A}(B) =
\bar g(B) - (\ell + k)$.

□
Lemma 5.5. (a) For each $B \in (\underline{B}_1, \bar B)$ and each $A \in (\underline{A}(B), \bar A(B)]$, there exist unique
$U(A, B)$ and $u(A, B)$ with

$$
x_1(B) < U(A, B) < x_2(B) < u(A, B)
\tag{5.45}
$$

such that

$$
g_{A,B}(U(A, B)) = g_{A,B}(u(A, B)) = \ell,
\tag{5.46}
$$

$$
g'_{A,B}(U(A, B)) > 0, \qquad g'_{A,B}(u(A, B)) < 0,
\tag{5.47}
$$

$$
g_{A,B}(x_1(B)) \le -k.
\tag{5.48}
$$

(b) For each fixed $B \in (\underline{B}_1, \bar B)$, $U(A, B)$ and $u(A, B)$ are continuously differentiable functions
of $A \in (\underline{A}(B), \bar A(B)]$. The function $U(A, B)$ is decreasing in $A$ and the function $u(A, B)$
is increasing in $A$.

Proof. (a) For each $B \in (\underline{B}_1, \bar B)$ and each $A \in (\underline{A}(B), \bar A(B)]$, we have $g_{A,B}(x_2(B)) > \ell$
and $g_{A,B}(x_1(B)) \le -k$. Thus, there are unique $U(A, B)$ and $u(A, B)$ that satisfy (5.46)-(5.47).
When $A \in (\underline{A}(B), \bar A(B)]$, the inequality (5.48) holds. This inequality implies that
$U(A, B) > x_1(B)$, which in turn implies that inequality (5.45) holds.
(b) Using the Implicit Function Theorem, we have

$$
\frac{\partial}{\partial A}U(A, B) = -\frac{1}{g'_{A,B}(U(A, B))} < 0, \qquad \frac{\partial}{\partial A}u(A, B) = -\frac{1}{g'_{A,B}(u(A, B))} > 0.
$$

This proves part (b) of the lemma. □

Fix a $B \in (\underline{B}_1, \bar B)$. For $A \in (\underline{A}(B), \bar A(B)]$ let

$$
\Lambda_2(A, B) = \int_{U(A,B)}^{u(A,B)} [g_{A,B}(x) - \ell]\,dx.
\tag{5.49}
$$

We would like to show that there exists a unique $A^*(B) \in (\underline{A}(B), \bar A(B)]$ such that

$$
\Lambda_2(A^*(B), B) = L.
\tag{5.50}
$$

Lemma 5.6. Fix a $B \in (\underline{B}_1, \bar B)$. The function $\Lambda_2(A, B)$ is continuous and strictly
increasing in $A \in (\underline{A}(B), \bar A(B)]$. Furthermore,

$$
\lim_{A \downarrow \underline{A}(B)} \Lambda_2(A, B) = 0.
\tag{5.51}
$$
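Proof. By the Implicit Function Theorem, we have

$$
\begin{aligned}
\frac{\partial \Lambda_2(A, B)}{\partial A} &= \frac{\partial u(A, B)}{\partial A}\big[g_{A,B}(u(A, B)) - \ell\big] - \frac{\partial U(A, B)}{\partial A}\big[g_{A,B}(U(A, B)) - \ell\big] + \int_{U(A,B)}^{u(A,B)} 1\,dx \\
&= u(A, B) - U(A, B) \\
&> 0.
\end{aligned}
\tag{5.52}
$$

Therefore $\Lambda_2(A, B)$ is strictly increasing in $A \in (\underline{A}(B), \bar A(B)]$.
Observe that

$$
\lim_{A \downarrow \underline{A}(B)} g_{A,B}(x_2(B)) = \lim_{A \downarrow \underline{A}(B)} \Big[A - \frac{1}{\mu}h(x_2(B))\Big] = \ell.
$$

By the definitions of $U(A, B)$ and $u(A, B)$, we have

$$
\lim_{A \downarrow \underline{A}(B)} U(A, B) = \lim_{A \downarrow \underline{A}(B)} u(A, B) = x_2(B),
$$

which proves (5.51). □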



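Lemma 5.7. The function $\Lambda_2(\bar A(B), B)$ is continuous and strictly increasing in $B \in
(\underline{B}_1, \bar B)$. Furthermore,

$$
\lim_{B \downarrow \underline{B}_1} \Lambda_2(\bar A(B), B) = 0 \quad\text{and}\quad \lim_{B \uparrow \bar B} \Lambda_2(\bar A(B), B) = \infty.
\tag{5.53}
$$

Therefore, there exists a unique $\underline{B}_2 \in (\underline{B}_1, \bar B)$ such that

$$
\Lambda_2(\bar A(\underline{B}_2), \underline{B}_2) = L \quad\text{and}\quad \Lambda_2(\bar A(B), B) > L \text{ for } B \in (\underline{B}_2, \bar B).
\tag{5.54}
$$

Proof. We first check that $\Lambda_2(\bar A(B), B)$ is strictly increasing in $B \in (\underline{B}_1, \bar B)$. To see this,
it follows from (5.34) and the definition of $\bar A(B)$ in (5.38) that

$$
\frac{d\bar A(B)}{dB} = \frac{1}{\mu}h'(x_1(B))\,\frac{dx_1(B)}{dB} = e^{-\lambda(x_1(B)-a)},
\tag{5.55}
$$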
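which implies

$$
\begin{aligned}
\frac{d\Lambda_2(\bar A(B), B)}{dB} &= \frac{du(\bar A(B), B)}{dB}\big[g_{\bar A(B),B}(u(\bar A(B), B)) - \ell\big] - \frac{dU(\bar A(B), B)}{dB}\big[g_{\bar A(B),B}(U(\bar A(B), B)) - \ell\big] \\
&\quad + \int_{U(\bar A(B),B)}^{u(\bar A(B),B)} \frac{d}{dB}\,g_{\bar A(B),B}(x)\,dx \\
&= \int_{U(\bar A(B),B)}^{u(\bar A(B),B)} \big[-e^{-\lambda(x-a)} + e^{-\lambda(x_1(B)-a)}\big]\,dx \\
&> 0,
\end{aligned}
\tag{5.56}
$$

where the last inequality is due to $x_1(B) < U(\bar A(B), B) < u(\bar A(B), B)$ for $B \in (\underline{B}_1, \bar B)$. Thus, we
have proved that $\Lambda_2(\bar A(B), B)$ is strictly increasing in $B \in (\underline{B}_1, \bar B)$.
Because $\bar g(\underline{B}_1) = k + \ell$ and $\bar A(B) - \underline{A}(B) = \bar g(B) - (\ell + k)$, we have

$$
\underline{A}(\underline{B}_1) = \bar A(\underline{B}_1), \quad\text{so that}\quad g_{\bar A(\underline{B}_1),\underline{B}_1}(x_2(\underline{B}_1)) = \ell.
$$

Thus,

$$
\lim_{B \downarrow \underline{B}_1} U(\bar A(B), B) = \lim_{B \downarrow \underline{B}_1} u(\bar A(B), B) = x_2(\underline{B}_1).
$$

It follows that

$$
\lim_{B \downarrow \underline{B}_1} \Lambda_2(\bar A(B), B) = 0.
\tag{5.57}
$$

We now show that

$$
\lim_{B \uparrow \bar B} \Lambda_2(\bar A(B), B) = \infty.
\tag{5.58}
$$

It is clear that (5.57), (5.58) and the monotonicity imply the existence of a unique $\underline{B}_2 \in
(\underline{B}_1, \bar B)$ that satisfies (5.54).

To prove (5.58), one can check that

$$
\frac{dU(\bar A(B), B)}{dB} = \frac{e^{-\lambda(U(\bar A(B),B)-a)} - e^{-\lambda(x_1(B)-a)}}{g'_{\bar A(B),B}(U(\bar A(B), B))} < 0,
\tag{5.59}
$$

$$
\frac{du(\bar A(B), B)}{dB} = \frac{e^{-\lambda(u(\bar A(B),B)-a)} - e^{-\lambda(x_1(B)-a)}}{g'_{\bar A(B),B}(u(\bar A(B), B))} > 0.
\tag{5.60}
$$
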
Therefore, U{A(B),B) decreases in B and u{A(B), B) increases in B. Thus, 
U(A{B),B) < U(A{B^),B^) and u(A{B),B) > u(A{B^),B^) 
as B. Therefore, noting that $g_{\bar A(B),B}(x) - \ell > 0$ for $x \in (U(\bar A(B), B),\, u(\bar A(B), B))$, we
have



_ i-u{A{B),B) 

limA2iA{B),B) = lim I 

BfB BtB Ju{A{B),B) 

ru(A{B^),B^) ^ 

> Hm / 

BfB Ju(A{B^),B^) 

ru(A{B^),B,) , 



9a{B),b(^) ^ 



dx 



9A{B),Bi^) ^ 



dx 



Hm 

BtB JU(A(B_^),B^) 



A{B) + go,Bix) - I 



dx 



{U{AiB,),B,)-uiA{B,),B,)) lim AiB) 

B^B 



+ 



OO, 



U(A{B^),B^) 



dx 



where we have used the fact that

$$
\lim_{B \uparrow \bar B} \bar A(B) = \lim_{B \uparrow \bar B} h(x_1(B))/\mu - k = \infty.
$$

□



Lemma 5.6 and the inequality in (5.54) immediately imply the following lemma.

Lemma 5.8. For each $B \in [\underline{B}_2, \bar B)$, there exists a unique $A^*(B) \in (\underline{A}(B), \bar A(B)]$ such that

$$
\Lambda_2(A^*(B), B) = L.
\tag{5.61}
$$

Finally, we prove the following lemma, which in turn proves Theorem 5.2.

Lemma 5.9. There exist unique $B^* \in (\underline{B}_2, \bar B)$, $d^*$ and $D^*$ that satisfy

$$
\begin{aligned}
& d^* < x_1(B^*) < D^* < U(A^*(B^*), B^*), \\
& g_{A^*(B^*),B^*}(d^*) = g_{A^*(B^*),B^*}(D^*) = -k, \\
& g'_{A^*(B^*),B^*}(d^*) < 0, \qquad g'_{A^*(B^*),B^*}(D^*) > 0, \\
& \int_{d^*}^{D^*} [g_{A^*(B^*),B^*}(x) + k]\,dx = -K.
\end{aligned}
$$


Proof. By Lemma 5.8 and the inequality in (5.54), for $B \in (\underline{B}_2, \bar B)$, we have $A^*(B) \in
(\underline{A}(B), \bar A(B))$. It follows from (5.33) and (5.39) that

$$
g_{A^*(B),B}(x_1(B)) = A^*(B) - \frac{1}{\mu}h(x_1(B)) < -k.
$$

Therefore, there exist unique $d(B)$ and $D(B)$ such that

$$
\begin{aligned}
& d(B) < x_1(B) < D(B) < U(A^*(B), B), \\
& g_{A^*(B),B}(d(B)) = g_{A^*(B),B}(D(B)) = -k, \\
& g'_{A^*(B),B}(d(B)) < 0, \qquad g'_{A^*(B),B}(D(B)) > 0.
\end{aligned}
$$

Let

$$
\Lambda_1(A^*(B), B) = \int_{d(B)}^{D(B)} [g_{A^*(B),B}(x) + k]\,dx.
\tag{5.62}
$$

We are going to prove that $\Lambda_1(A^*(B), B)$ is continuous and strictly decreasing in $B \in
[\underline{B}_2, \bar B)$ and

$$
\lim_{B \downarrow \underline{B}_2} \Lambda_1(A^*(B), B) = 0 \quad\text{and}\quad \lim_{B \uparrow \bar B} \Lambda_1(A^*(B), B) = -\infty.
$$

Therefore, there exists a unique $B^* \in (\underline{B}_2, \bar B)$ such that

$$
\Lambda_1(A^*(B^*), B^*) = -K,
$$

from which one proves the lemma.

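To prove that $\Lambda_1(A^*(B), B)$ is continuous and strictly decreasing in $B \in [\underline{B}_2, \bar B)$, we
apply the Implicit Function Theorem to (5.61). We have

$$
\frac{dA^*(B)}{dB} = \frac{\int_{U(A^*(B),B)}^{u(A^*(B),B)} e^{-\lambda(x-a)}\,dx}{u(A^*(B), B) - U(A^*(B), B)} > 0.
\tag{5.63}
$$

Equation (5.63) yields that, for $x \in [d(B), D(B)]$,

$$
\frac{dg_{A^*(B),B}(x)}{dB} = \frac{dA^*(B)}{dB} - e^{-\lambda(x-a)}
= \frac{\int_{U(A^*(B),B)}^{u(A^*(B),B)} \big[e^{-\lambda(y-a)} - e^{-\lambda(x-a)}\big]\,dy}{u(A^*(B), B) - U(A^*(B), B)} < 0.
$$

This in turn implies that

$$
\frac{d\Lambda_1(A^*(B), B)}{dB} = \int_{d(B)}^{D(B)} \frac{dg_{A^*(B),B}(x)}{dB}\,dx < 0.
\tag{5.64}
$$

Therefore, $\Lambda_1(A^*(B), B)$ is strictly decreasing in $B \in [\underline{B}_2, \bar B)$.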
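It follows from (5.54) and Lemma 5.8 that

$$
A^*(\underline{B}_2) = \bar A(\underline{B}_2).
$$

This, together with the definition of $\bar A(B)$ in (5.38), shows that

$$
g_{A^*(\underline{B}_2),\underline{B}_2}(x_1(\underline{B}_2)) = g_{\bar A(\underline{B}_2),\underline{B}_2}(x_1(\underline{B}_2)) = \bar A(\underline{B}_2) - \frac{1}{\mu}h(x_1(\underline{B}_2)) = -k.
\tag{5.65}
$$

Therefore, we have

$$
\lim_{B \downarrow \underline{B}_2} D(B) = \lim_{B \downarrow \underline{B}_2} d(B) = x_1(\underline{B}_2).
$$

It follows that

$$
\lim_{B \downarrow \underline{B}_2} \Lambda_1(A^*(B), B) = 0.
\tag{5.66}
$$

It remains to prove

$$
\lim_{B \uparrow \bar B} \Lambda_1(A^*(B), B) = -\infty.
\tag{5.67}
$$

For $B \in (\underline{B}_2, \bar B)$,

$$
\begin{aligned}
\frac{dg_{A^*(B),B}(x_1(B))}{dB} &= \frac{dA^*(B)}{dB} - e^{-\lambda(x_1(B)-a)} + g'_{A^*(B),B}(x_1(B))\,\frac{dx_1(B)}{dB} \\
&= \frac{\int_{U(A^*(B),B)}^{u(A^*(B),B)} \big[e^{-\lambda(y-a)} - e^{-\lambda(x_1(B)-a)}\big]\,dy}{u(A^*(B), B) - U(A^*(B), B)} \\
&< 0,
\end{aligned}
$$

which, together with (5.65), implies that

$$
g_{A^*(B),B}(x_1(B)) < -k
\tag{5.68}
$$

for each $B \in (\underline{B}_2, \bar B)$. Fix a $\underline{B}_3 \in (\underline{B}_2, \bar B)$ and let

$$
M_1 = \big(-k - g_{A^*(\underline{B}_3),\underline{B}_3}(x_1(\underline{B}_3))\big)/2.
$$

It follows from (5.68) that $M_1 > 0$. Then for each $B \in (\underline{B}_3, \bar B)$,

$$
g_{A^*(B),B}(x_1(B)) < g_{A^*(\underline{B}_3),\underline{B}_3}(x_1(\underline{B}_3)) = -k - 2M_1 < -k - M_1.
$$

Therefore, for each $B \in (\underline{B}_3, \bar B)$ there exist unique $d_1(B)$ and $D_1(B)$ such that

$$
\begin{aligned}
& d_1(B) < x_1(B) < D_1(B), \\
& g_{A^*(B),B}(d_1(B)) = g_{A^*(B),B}(D_1(B)) = -k - M_1, \\
& g'_{A^*(B),B}(d_1(B)) < 0, \qquad g'_{A^*(B),B}(D_1(B)) > 0.
\end{aligned}
\tag{5.69}
$$

The properties of $g$ in Lemma 5.2 (see also Figure 1) imply that for each $B \in (\underline{B}_3, \bar B)$,

$$
d(B) < d_1(B) < x_1(B) < D_1(B) < D(B).
$$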
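This, together with (5.32), implies that

$$
\lim_{B \uparrow \bar B} d_1(B) = -\infty.
\tag{5.70}
$$

Note that for $x \in (d(B), D(B))$, $g_{A^*(B),B}(x) < -k$. Therefore, for $B \in (\underline{B}_3, \bar B)$,

$$
\begin{aligned}
\Lambda_1(A^*(B), B) &= \int_{d(B)}^{D(B)} [g_{A^*(B),B}(x) + k]\,dx \\
&\le \int_{d_1(B)}^{D_1(B)} [g_{A^*(B),B}(x) + k]\,dx \\
&\le \int_{d_1(B)}^{D_1(B)} [-M_1]\,dx \\
&= -M_1\big(D_1(B) - d_1(B)\big).
\end{aligned}
$$

It follows from (5.69) and (5.63) that for each $B \in (\underline{B}_3, \bar B)$,

$$
\frac{dD_1(B)}{dB} = \frac{\int_{U(A^*(B),B)}^{u(A^*(B),B)} \big[e^{-\lambda(D_1(B)-a)} - e^{-\lambda(x-a)}\big]\,dx}{\big(u(A^*(B), B) - U(A^*(B), B)\big)\,g'_{A^*(B),B}(D_1(B))} > 0.
$$

Thus, for any $B \in (\underline{B}_3, \bar B)$,

$$
D_1(B) > D_1(\underline{B}_3).
$$

Thus, for any $B \in (\underline{B}_3, \bar B)$,

$$
\Lambda_1(A^*(B), B) \le M_1 d_1(B) - M_1 D_1(\underline{B}_3).
\tag{5.71}
$$

Now (5.67) readily follows from (5.71) and (5.70). □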
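The chain of monotonicity results in Lemmas 5.2-5.9 suggests a nested bisection scheme for the optimal band: an outer bisection on $B$ and, for each $B$, an inner bisection on $A$ that enforces (5.21). The Python sketch below is one possible reading of that scheme, not the authors' algorithm. It assumes the quadratic holding cost $h(x) = x^2$ with $a = 0$ (so that $g_{0,B}$ and $F_1$ have closed forms), uses illustrative parameter values, and all function names are hypothetical. A value of $B$ is treated as too small when $\bar A(B) \le \underline{A}(B)$, when $\Lambda_2(\bar A(B), B) < L$, or when $\Lambda_1(A^*(B), B) > -K$.

```python
import math

MU, SIGMA = 1.0, 1.0                      # illustrative model parameters (assumed)
LAM = 2.0 * MU / SIGMA**2                 # lambda from (5.4)
K, k, L, ell = 1.0, 1.0, 1.0, 1.0         # illustrative cost parameters (assumed)
B_BAR = 2.0 / (MU * LAM**2)               # (5.25) for h(x) = x^2, a = 0

def g0(B, x):
    """g_{0,B}(x) from (5.19) for h(x) = x^2, a = 0 (closed form)."""
    e = math.exp(-LAM * x)
    return -B * e - (x * x - 2 * x / LAM + 2 / LAM**2 - 2 * e / LAM**2) / MU

def F1(B, x):
    """F_1(B, x) from (5.29) for h'(y) = 2y (closed form)."""
    return B - (2 / MU) * ((x / LAM - 1 / LAM**2) * math.exp(LAM * x) + 1 / LAM**2)

def bisect(f, lo, hi, iters=60):
    lo_positive = f(lo) > 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def expand(pred, start, step):
    """Double the step away from start until pred holds; returns the bracketing point."""
    while not pred(start + step):
        step *= 2.0
    return start + step

def simpson(f, a, b, n=200):
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * step)
    return total * step / 3

def upper(A, B, x1, x2):
    """Return U, u and Lambda_2(A, B) as in (5.46) and (5.49)."""
    f = lambda x: A + g0(B, x) - ell
    U = bisect(f, x1, x2)
    u = bisect(f, x2, expand(lambda x: f(x) < 0, x2, 1.0))
    return U, u, simpson(f, U, u)

def lower(A, B, x1, U):
    """Return d, D and Lambda_1(A, B) as in (5.62)."""
    f = lambda x: A + g0(B, x) + k
    D = bisect(f, x1, U)
    d = bisect(f, expand(lambda x: f(x) > 0, x1, -1.0), x1)
    return d, D, simpson(f, d, D)

def solve_for(B):
    """For a given B, return (A*, d, D, U, u, Lambda_1) or None if B is too small."""
    F = lambda x: F1(B, x)
    x1 = bisect(F, expand(lambda x: F(x) < 0, 0.0, -1.0), 0.0)
    x2 = bisect(F, 0.0, expand(lambda x: F(x) < 0, 0.0, 1.0))
    A_lo, A_hi = ell - g0(B, x2), -k - g0(B, x1)      # (5.36) and (5.38)
    if A_hi <= A_lo or upper(A_hi, B, x1, x2)[2] < L:
        return None
    A = bisect(lambda a_: upper(a_, B, x1, x2)[2] - L, A_lo, A_hi)
    U, u, _ = upper(A, B, x1, x2)
    d, D, lam1 = lower(A, B, x1, U)
    return A, d, D, U, u, lam1

def optimal_band(iters=50):
    """Outer bisection on B over (0, B_BAR); returns (B*, A*, d*, D*, U*, u*)."""
    lo, hi = 0.0, B_BAR
    for _ in range(iters):
        B = 0.5 * (lo + hi)
        sol = solve_for(B)
        if sol is None or sol[5] + K > 0:
            lo = B
        else:
            hi = B
    A, d, D, U, u, _ = solve_for(hi)
    return hi, A, d, D, U, u
```

Calling `optimal_band()` returns approximations of $(B^*, A^*, d^*, D^*, U^*, u^*)$ for this illustrative instance. Under the identification $A = 2\gamma/(\lambda\sigma^2) = \gamma/\mu$ made before (5.19), the corresponding optimal average cost would be `MU * A`. The sketch assumes the outer bisection updates `hi` at least once, which holds whenever $B^*$ is not extremely close to $\bar B$.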

6 Singular Controls 

In this section, we assume that $K = 0$ and $L = 0$. Therefore, we restrict our feasible
policies to singular controls, also known as instantaneous controls, as in (2.2) and (2.3). A
two-parameter control band policy is defined by two parameters $d$, $u$, where $d < u$. No
control is exercised until the inventory level $Z(t)$ reaches the lower boundary $d$ or the upper
boundary $u$. When $Z(t)$ reaches a boundary, there is no advantage in using impulse control
because there is no fixed cost.
6.1 Control Band Policies

Let us fix a two-parameter control band policy $\varphi = \{d, u\}$. To mathematically describe
the control process $(Y_1, Y_2)$, we need to use the two-sided regulator: for each $x \in \mathbb{D}$ with
$x(0) \in [d, u]$, find a triple $(y_1, y_2, z) \in \mathbb{D}^3$ such that

$$
z(t) = x(t) + y_1(t) - y_2(t), \qquad t \ge 0,
\tag{6.1}
$$

$$
z(t) \in [d, u], \qquad t \ge 0,
\tag{6.2}
$$

$$
y_1(0) = y_2(0) = 0, \quad y_1 \text{ and } y_2 \text{ are nondecreasing},
\tag{6.3}
$$

$$
y_1 \text{ and } y_2 \text{ increase only when } z = d \text{ and } z = u, \text{ respectively}.
\tag{6.4}
$$

The precise mathematical meaning of (6.4) is

$$
\int_0^\infty (z(t) - d)\,dy_1(t) = 0 \quad\text{and}\quad \int_0^\infty (u - z(t))\,dy_2(t) = 0.
\tag{6.5}
$$

One can verify that (6.5) is equivalent to the following: whenever $z(t) > d$ for $t \in [t_1, t_2]$,
$y_1(t_2) - y_1(t_1) = 0$, and whenever $z(t) < u$ for $t \in [t_1, t_2]$, $y_2(t_2) - y_2(t_1) = 0$. Lemma 6.1
below follows from Proposition 6 in Section 2.4 of [18]. That proposition is stated for each
continuous path $x \in \mathbb{D}$; one can verify that the proposition continues to hold when the
continuity of $x$ is dropped.

Lemma 6.1. For each $x \in \mathbb{D}$ with $x(0) \in [d, u]$, there exists a unique triple $(y_1, y_2, z) \in \mathbb{D}^3$
that satisfies (6.1)-(6.5).

The lemma asserts that the map $\Psi : x \in \mathbb{D}_0 \to (y_1, y_2, z) \in \mathbb{D}^3$ is well defined, where
$\mathbb{D}_0 = \{x \in \mathbb{D} : x(0) \in [d, u]\}$. In the following, we use notation

$$
y_1 = \Psi_1(x), \quad y_2 = \Psi_2(x), \quad\text{and}\quad z = \Psi_3(x).
$$

The nondecreasing functions $(y_1, y_2)$ are said to be the two-sided regulator of $x$, and $z$ is
the regulated path of $x$. When either $u = \infty$ or $d = -\infty$, the corresponding one-sided
regulator is defined in Section 2.2 of [18].

Under the control band policy $\{d, u\}$ with initial inventory level $x \in [d, u]$, the controls
$(Y_1, Y_2)$ are given by $Y_1 = \Psi_1(X)$, $Y_2 = \Psi_2(X)$, and the inventory process $Z = \Psi_3(X)$.
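For a path sampled at discrete times, conditions (6.1)-(6.4) can be enforced step by step: each increment of the free path is applied to the regulated level, and any excursion outside $[d, u]$ is pushed back by the minimal amount. The Python sketch below illustrates this for a finite list of path values; it is a discrete-time illustration of the regulator, with the hypothetical function name `two_sided_regulator`, and it is not taken from the paper or from [18].

```python
def two_sided_regulator(x, d, u):
    """Discrete-time two-sided regulator on [d, u].

    x is a list of sampled path values with d <= x[0] <= u.
    Returns lists (y1, y2, z) with z[i] = x[i] + y1[i] - y2[i], as in (6.1).
    """
    y1, y2, z = [0.0], [0.0], [x[0]]
    for prev, cur in zip(x, x[1:]):
        free = z[-1] + (cur - prev)        # where z would go without control
        push_up = max(d - free, 0.0)       # minimal upward push, applied only at d
        push_down = max(free - u, 0.0)     # minimal downward push, applied only at u
        y1.append(y1[-1] + push_up)
        y2.append(y2[-1] + push_down)
        z.append(free + push_up - push_down)
    return y1, y2, z
```

For instance, with `x = [0.0, -0.5, -1.5, 0.2, 2.5, 1.0]`, `d = -1.0` and `u = 1.0`, the lower regulator accumulates a total push of 0.5 and the upper regulator a total push of 2.0, and the regulated path stays in $[-1, 1]$.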

To find the long-run average cost under the policy $\varphi = \{d, u\}$, we use the following
theorem.

Theorem 6.1. Fix a control band policy $\varphi = \{d, u\}$. If there exist a constant $\gamma$ and a
twice continuously differentiable function $V : [d, u] \to \mathbb{R}$ that satisfies

$$
\Gamma V(x) + h(x) = \gamma, \qquad d \le x \le u,
\tag{6.6}
$$

with boundary conditions

$$
V'(d) = -k,
\tag{6.7}
$$

$$
V'(u) = \ell,
\tag{6.8}
$$

then the average cost $AC(x, \varphi)$ is independent of the initial inventory level $x \in \mathbb{R}$ and is
given by $\gamma$ in (6.6).

Proof. First we assume $x \in [d, u]$. In this case, $Z(0) = x$. By Itô's formula,

$$
\begin{aligned}
V(Z(t)) &= V(Z(0)) + \int_0^t \Gamma V(Z(s))\,ds + \sigma\int_0^t V'(Z(s))\,dW(s) + \int_0^t V'(Z(s))\,dY_1(s) \\
&\quad - \int_0^t V'(Z(s))\,dY_2(s) \\
&= V(Z(0)) + \int_0^t \Gamma V(Z(s))\,ds + \sigma\int_0^t V'(Z(s))\,dW(s) + V'(d)\,Y_1(t) - V'(u)\,Y_2(t) \\
&= V(Z(0)) + \gamma t - \int_0^t h(Z(s))\,ds + \sigma\int_0^t V'(Z(s))\,dW(s) - kY_1(t) - \ell Y_2(t).
\end{aligned}
$$

Therefore

$$
\mathbb{E}_x[V(Z(t))] = \mathbb{E}_x[V(Z(0))] + \gamma t - \mathbb{E}_x\Big(\int_0^t h(Z(s))\,ds + kY_1(t) + \ell Y_2(t)\Big).
$$

Dividing both sides by $t$ and taking the limit as $t \to \infty$, we have $AC(x, \varphi) = \gamma$.

When $x \notin [d, u]$, we assume $Z$ immediately jumps to the closest point in $[d, u]$ at time
0. Therefore, $Z(0) = d$ if $x < d$ and $Z(0) = u$ if $x > u$. Since $Z(0) \in [d, u]$, the rest of the
proof is identical to the case when $x \in [d, u]$.

□

Proposition 2. Let (p = {d, u} be a control hand policy with 

d < u. 

Let m G M &e any fixed number. Define 

= / g{y)dy 

J m 

with 

g{x) = y (m)e^("^-") + / ^^^''""^^2/ " ^ / h{y)e^^y-''Uy, 
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where 

di{f2+i) + d2{fi + k) 

7 = J — ^ , (6-9 

0(162 + 0261 

y, ^ e.(A + Q + e.(/.+t)^ 

die2 + 0261 

Then {V,j) is a solution to (6.6)-(6.8). In (6.9) and (6.10), we set 

ei = t e^^y-'^^dy, 62 = ^ r e^^y-^Uy, (6.12) 

/i = -- / /i(y)e^('^-'^)dy, /2 = - / /i(?/)e^(^-")dy. (6.13) 
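Proof. Similar to the proof of Proposition 1, equation (6.6) implies that

$$V'(x) = e^{\lambda(m-x)}V'(m) + \gamma\frac{2}{\sigma^2}\int_m^x e^{\lambda(y-x)}\,dy - \frac{2}{\sigma^2}\int_m^x h(y)e^{\lambda(y-x)}\,dy.$$

Boundary conditions (6.7) and (6.8) become

$$e^{\lambda(m-d)}V'(m) + \gamma\frac{2}{\sigma^2}\int_m^d e^{\lambda(y-d)}\,dy - \frac{2}{\sigma^2}\int_m^d h(y)e^{\lambda(y-d)}\,dy = -k, \tag{6.14}$$

$$e^{\lambda(m-u)}V'(m) + \gamma\frac{2}{\sigma^2}\int_m^u e^{\lambda(y-u)}\,dy - \frac{2}{\sigma^2}\int_m^u h(y)e^{\lambda(y-u)}\,dy = \ell. \tag{6.15}$$

Using the coefficients defined in (6.11)-(6.13), we see the boundary conditions (6.14) and (6.15) become

$$d_1 V'(m) - \gamma e_1 = -(k + f_1),$$
$$d_2 V'(m) + \gamma e_2 = \ell + f_2,$$

from which we have the unique solution for $\gamma$ and $V'(m)$ given in (6.9) and (6.10). □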
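
The linear system in the proof of Proposition 2 can be evaluated numerically. The following is a minimal Python sketch, not the authors' implementation. It assumes $\Gamma V = \tfrac{1}{2}\sigma^2 V'' + \mu V'$ and $\lambda = 2\mu/\sigma^2$, takes the coefficients to be the ones obtained by matching (6.14)-(6.15) to the system above, and uses a composite Simpson rule for the integrals. The names `simpson` and `band_average_cost` are hypothetical helpers introduced only for this illustration.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule for the signed integral of f from a to b (n even)."""
    if a == b:
        return 0.0
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * step)
    return total * step / 3.0


def band_average_cost(d, u, k, ell, mu, sigma, h, m=None):
    """Return (gamma, V'(m)) for the control band {d, u}.

    Solves the 2x2 linear system obtained by substituting the general
    solution of (6.6) into the boundary conditions (6.7)-(6.8).
    """
    lam = 2.0 * mu / sigma ** 2          # assumption: lambda = 2*mu/sigma^2
    c = 2.0 / sigma ** 2
    if m is None:
        m = 0.5 * (d + u)                # any reference point m may be used
    d1 = math.exp(lam * (m - d))
    d2 = math.exp(lam * (m - u))
    e1 = c * simpson(lambda y: math.exp(lam * (y - d)), d, m)
    e2 = c * simpson(lambda y: math.exp(lam * (y - u)), m, u)
    f1 = c * simpson(lambda y: h(y) * math.exp(lam * (y - d)), d, m)
    f2 = c * simpson(lambda y: h(y) * math.exp(lam * (y - u)), m, u)
    den = d1 * e2 + d2 * e1
    gamma = (d1 * (f2 + ell) + d2 * (f1 + k)) / den
    dV_m = (e1 * (f2 + ell) - e2 * (f1 + k)) / den
    return gamma, dV_m
```

As a sanity check under these assumptions, for $\mu = 0$, $\sigma = 1$, $h(x) = x^2$, $k = \ell = 1$ and the band $\{-1, 1\}$ the sketch returns $\gamma = 5/6$, which equals $\sigma^2(k+\ell)/(2(u-d))$ plus the average of $h$ over the band. The computed $\gamma$ does not depend on the choice of $m$, as Theorem 6.1 requires.
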
6.2 Optimal Policy and Optimal Parameters 

Theorem 4.1 suggests the following strategy to obtain an optimal policy. We hope the optimal policy is a control band policy. Therefore, the first task is to find an optimal control band policy among all control band policies. Denote this optimal control band policy by $\varphi^* = \{d^*, u^*\}$, $d^* < u^*$, with long-run average cost $\gamma^*$. We hope that $\gamma^*$ can be used as the constant in (4.1) of Theorem 4.1. To find the corresponding $f$ that satisfies all the conditions of Theorem 4.1, we start with the relative value function $V(x)$ associated with the policy $\varphi^*$, which is defined on the interval $[d^*, u^*]$. We need to extend $V(x)$ so that it is defined on $\mathbb{R}$. Given that $V(x)$ is the relative value function, it is natural to define

$$f(x) = \begin{cases} V(d^*) + k(d^* - x) & \text{for } x < d^*, \\ V(x) & \text{for } d^* \le x \le u^*, \\ V(u^*) + \ell(x - u^*) & \text{for } x > u^*. \end{cases} \tag{6.16}$$

Since we wish $f \in C^1(\mathbb{R})$, we should have

$$V'(d^*) = -k, \quad V'(u^*) = \ell. \tag{6.17}$$

We also hope $f \in C^2(\mathbb{R})$; for this we should have the following conditions,

$$V''(d^*) = 0, \quad V''(u^*) = 0. \tag{6.18}$$

In this section, we will first prove the existence of parameters $d^*$ and $u^*$ such that the relative value function $V$ corresponding to the control band policy $\varphi^* = \{d^*, u^*\}$ satisfies (6.6)-(6.8) and (6.17)-(6.18). Since part of the solution is to find the boundary points $d^*$ and $u^*$, equations (6.6)-(6.8) and (6.17)-(6.18) define a free boundary problem. We then prove that the extension $f$ in (6.16) and $\gamma^* = AC(x, \varphi^*)$ jointly satisfy all the conditions in Theorem 4.1.

In the rest of this section, we assume that $\mu > 0$. The statement and analysis for the cases $\mu < 0$ and $\mu = 0$ are analogous and are omitted. Recall the function $g(x) = g_{A,B}(x)$ defined in (5.19).

Theorem 6.2. There exist unique $A^*$, $B^*$, $d^*$ and $u^*$ such that $g(x) = g_{A^*,B^*}(x)$, $d^*$ and $u^*$ satisfy

$$g(d^*) = -k, \tag{6.19}$$
$$g(u^*) = \ell, \tag{6.20}$$
$$g'(d^*) = 0, \tag{6.21}$$
$$g'(u^*) = 0. \tag{6.22}$$

Furthermore, $g(x)$ decreases in $(-\infty, d^*)$, increases in $(d^*, u^*)$, and decreases again in $(u^*, \infty)$.

Proof. Recall the definition of $\bar{B}$ in (5.25). For each $B \in (0, \bar{B})$, by Lemma 5.3, there is a unique local minimizer $x_1(B) < a$ and a unique local maximizer $x_2(B) > a$ for the function $g_{0,B}(x)$. By Lemma 5.4, there exists a unique $\underline{B}_1 \in (0, \bar{B})$ that satisfies (5.42). Let

$$A^* = h(x_1(\underline{B}_1))/\mu - k = h(x_2(\underline{B}_1))/\mu + \ell.$$

Then $g(x) = g_{A^*,\underline{B}_1}(x)$, $d^* = x_1(\underline{B}_1)$ and $u^* = x_2(\underline{B}_1)$ satisfy (6.19)-(6.22); see Figure 6.2.

□

Figure 2: There exist unique $d^* = x_1(\underline{B}_1)$ and $u^* = x_2(\underline{B}_1)$.



Now we show that the control band policy $\varphi^* = \{d^*, u^*\}$ is an optimal policy among all feasible policies.

Theorem 6.3. Assume that $h$ satisfies Assumption 1. Let $d^*$ and $u^*$, along with constants $A^*$ and $B^*$, be the unique solution in Theorem 6.2. Then the control band policy $\varphi^* = \{d^*, u^*\}$ is optimal among all feasible policies.

Proof. Let $g(x)$, $x \in \mathbb{R}$, be the function in (5.19) with $A = A^*$ and $B = B^*$. Let

$$\bar{g}(x) = \begin{cases} -k, & x < d^*, \\ g(x), & d^* \le x \le u^*, \\ \ell, & x > u^*. \end{cases}$$

Define

$$V(x) = \int_{d^*}^x \bar{g}(y)\,dy. \tag{6.23}$$

Let $\gamma^*$ be the long-run average cost under policy $\varphi^*$. We now show that $V$ and $\gamma^*$ satisfy all the conditions in Theorem 4.1. Thus, Theorem 4.1 shows that the long-run average cost under any policy is at least $\gamma^*$. Therefore, $\gamma^*$ is the optimal cost and the control band policy $\varphi^*$ is an optimal policy. Now we check that $V(x)$ is in $C^2(\mathbb{R})$ and satisfies (4.1)-(4.3).

First, $V(x)$ is in $C^2([d^*, u^*])$. Lemma 6.2 and the definition of $V$ in (6.23) imply that

$$\lim_{x \uparrow d^*} V''(x) = 0 = \lim_{x \downarrow d^*} V''(x), \quad \text{and} \quad \lim_{x \uparrow u^*} V''(x) = 0 = \lim_{x \downarrow u^*} V''(x).$$

Then, $V''(x)$ is continuous at $d^*$ and $u^*$. Note that $V''(x) = 0$ in $(-\infty, d^*)$ and $(u^*, +\infty)$. Therefore, $V(x)$ is in $C^2(\mathbb{R})$. Let

$$M = \sup_{x \in [d^*, u^*]} |V'(x)|;$$

we have $|V'(x)| \le M$ for all $x \in \mathbb{R}$.

To check (4.1), we first find that $\Gamma V(x) + h(x) = \gamma^*$ for $d^* \le x \le u^*$. For $x < d^*$,

$$\begin{aligned}
\Gamma V(x) + h(x) &= \frac{\sigma^2}{2}V''(x) + \mu V'(x) + h(x) \\
&= \frac{\sigma^2}{2}V''(d^*) + \mu V'(d^*) + h(x) \\
&\ge \frac{\sigma^2}{2}V''(d^*) + \mu V'(d^*) + h(d^*) \\
&= \gamma^*,
\end{aligned}$$

where the second equality holds because for $x < d^*$, $V''(x) = 0 = V''(d^*)$ and $V'(x) = -k = V'(d^*)$, and the inequality is due to $x < d^* = x_1 < a$, where $a$ again is the minimum point of $h$. Similarly, for $x > u^*$, $\Gamma V(x) + h(x) \ge \gamma^*$.

Finally, (4.2) and (4.3) hold because $g(x)$ is strictly increasing in $x$, $x \in [d^*, u^*]$, and $\bar{g}(d^*) = g(d^*) = -k$, $\bar{g}(u^*) = g(u^*) = \ell$ (see Figure 6.2). Thus, the optimality of the control band policy $\varphi^*$ is implied by Theorem 4.1.

□
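
The conditions checked in this proof can be illustrated in a case where everything is available in closed form. The sketch below is only an illustration under stated assumptions, not part of the paper's argument: it takes the driftless case $\mu = 0$ (which the text says is analogous), a quadratic holding cost $h(x) = x^2$, and $\Gamma V = \tfrac{1}{2}\sigma^2 V''$. Under these assumptions, $V''(d^*) = V''(u^*) = 0$ gives $h(d^*) = h(u^*) = \gamma^*$, so $d^* = -u^*$, and $V'(u^*) - V'(d^*) = \ell + k$ gives $u^* = (3\sigma^2(k+\ell)/8)^{1/3}$. All function names are hypothetical.

```python
def driftless_quadratic_band(k, ell, sigma):
    """Smooth-pasting band (d*, u*, gamma*) for mu = 0 and h(x) = x**2 (assumed)."""
    u = (3.0 * sigma ** 2 * (k + ell) / 8.0) ** (1.0 / 3.0)
    return -u, u, u * u


def g_bar(x, k, ell, sigma):
    """Extended derivative V'(x): -k below d*, ell above u*, ODE solution between."""
    d, u, gamma = driftless_quadratic_band(k, ell, sigma)
    if x < d:
        return -k
    if x > u:
        return ell
    # (sigma^2/2) g'(x) = gamma - x^2 with g(d) = -k, integrated in closed form.
    return -k + (2.0 / sigma ** 2) * (gamma * (x - d) - (x ** 3 - d ** 3) / 3.0)


def band_cost(d, u, k, ell, sigma):
    """Average cost of an arbitrary band {d, u} when mu = 0 and h(x) = x**2."""
    push_cost = sigma ** 2 * (k + ell) / (2.0 * (u - d))
    holding_cost = (u ** 3 - d ** 3) / (3.0 * (u - d))
    return push_cost + holding_cost
```

In this special case `g_bar` rises from $-k$ at $d^*$ to $\ell$ at $u^*$, `band_cost(d*, u*)` equals $\gamma^* = (u^*)^2$, and perturbing either boundary increases the cost, which is consistent with the optimality claim of Theorem 6.3.
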



7 No Inventory Backlog 

In this section, inventory backlog is not allowed, and thus we add the constraint $Z(t) \ge 0$ for all $t \ge 0$. The holding cost function $h(\cdot)$ is defined on $[0, \infty)$, and $a \in [0, \infty)$ is its minimum point. We focus on the impulse control case when $K > 0$ and $L > 0$. Thus, this section parallels Section 5. In particular, the results and proofs in this section are analogous to those in Section 5. In our presentation, we will highlight the differences.

For a control band policy $\{d, D, U, u\}$ with $0 \le d < D < U < u$, one can continue to use Theorem 5.1 to evaluate its performance and to obtain its relative value function. But the lower bound theorem, Theorem 4.1, needs to be slightly modified as in the following theorem.

Theorem 7.1. Suppose that $f \in C^1([0, +\infty))$ and $f'$ is absolutely continuous such that $f''$ is locally $L^1$. Suppose that there exists a constant $M > 0$ such that $|f'(x)| \le M$ for all $x \in [0, +\infty)$. Assume further that

$$\Gamma f(x) + h(x) \ge \gamma \quad \text{for } x \in [0, +\infty), \tag{7.1}$$
$$f(y) - f(x) \le K + k(x - y) \quad \text{for } 0 \le y < x, \tag{7.2}$$
$$f(y) - f(x) \le L + \ell(y - x) \quad \text{for } 0 \le x < y. \tag{7.3}$$

Then $AC(x, \varphi) \ge \gamma$ for each feasible policy $\varphi$ and each initial state $x \in [0, +\infty)$.
7.1 Optimal Policy Parameters 

Recall that for a given set of parameters $\{d, D, U, u\}$ with $0 \le d < D < U < u$, the corresponding relative value function satisfies (5.1)-(5.3). To search for the optimal parameters $(d^*, D^*, U^*, u^*)$, we impose the following conditions on $\{d, D, U, u\}$ and $V$:

$$V'(U) = \ell, \tag{7.4}$$
$$V'(u) = \ell, \tag{7.5}$$
$$V'(D) = -k, \tag{7.6}$$
$$V'(d) = -k - \alpha, \tag{7.7}$$
$$0 \le d < D < U < u, \tag{7.8}$$
$$\alpha d = 0, \text{ and} \tag{7.9}$$
$$\alpha \ge 0. \tag{7.10}$$



In some cases, it is optimal to have d* = 0. In such a case, one only needs to solve for 
three parameters D* , U* and u*. This section is analogous to Section 5.2. We highlight 
the differences between these two sections and omit some details to avoid repetition. 

Recall that $a$ is the minimum point of the holding cost function $h(x)$ on $[0, \infty)$. It is possible that $a = 0$ or $a > 0$. In the following, whenever Assumption 1 is invoked for $h$, any condition on $h(x)$ with $x < 0$ is ignored. Similar to Lemma 5.1, we have the following lemma.

Lemma 7.1. For each $A, B \in \mathbb{R}$, the function $g(x) = g_{A,B}(x)$ in (5.19) is a solution to the equation

$$\Gamma g(x) + h'(x) = 0 \quad \text{for all } x \in [0, \infty) \setminus \{a\}. \tag{7.11}$$

The following theorem solves the free boundary problem when inventory backlog is not 
allowed. 

Theorem 7.2. Assume that the holding cost function $h$ satisfies conditions (a)-(d) of Assumption 1. There exist unique $A$, $B$, $d$, $D$, $U$ and $u$ with

$$0 \le d < D \quad \text{and} \quad U < x_2 < u$$

and such that the corresponding $g(x) = g_{A,B}(x)$ satisfies

$$\int_d^D [g(x) + k]\,dx = -K, \tag{7.12}$$
$$\int_U^u [g(x) - \ell]\,dx = L, \tag{7.13}$$
$$g(d) = -k - \alpha, \quad g(D) = -k, \tag{7.14}$$
$$g(U) = g(u) = \ell, \tag{7.15}$$
$$\alpha d = 0, \text{ and} \tag{7.16}$$
$$\alpha \ge 0. \tag{7.17}$$

Furthermore, $g$ has a local minimum at $x_1 \le a$ and $g$ has the maximum at $x_2 > a$. The function $g$ is decreasing on $(0, x_1)$, increasing on $(x_1, x_2)$ and decreasing again on $(x_2, \infty)$.

Figure 3: In the nonnegative case, the optimal control band policy has two possible cases: $d > 0$ or $d = 0$.



We leave the proof of Theorem 7.2 to the end of this section. 

Theorem 7.3. Assume that the holding cost function $h$ satisfies conditions (a)-(d) of Assumption 1. Let $0 \le d^* < D^* < U^* < u^*$, along with constants $A^*$ and $B^*$, be the unique solution in Theorem 7.2. Then the control band policy $\varphi^* = \{d^*, D^*, U^*, u^*\}$ is optimal among all feasible policies to minimize the long-run average cost when inventory backlog is not allowed.

Proof. The proof is identical to that of Theorem 5.3. □ 

The rest of this section is devoted to the proof for Theorem 7.2. This proof is similar to 
the proof of Theorem 5.2. We provide an outline of the proof for Theorem 7.2, highlighting 
differences between the two proofs. We only consider the case when $\mu > 0$. Other cases
are analogous and are omitted. Define 



Because $h'(x) < 0$ for $x \in (0, a)$, $\bar{B}_1 > 0$.

The following lemma is analogous to Lemma 5.2. The only difference is that the expression for $x_1 = x_1(B)$ has two forms in Lemma 7.2.

Lemma 7.2. (a) For any $A \in \mathbb{R}$ and for each fixed $B \in (0, \infty)$, $g_{A,B}$ attains a unique minimum in $[0, a]$ at $x_1 = x_1(B) \in [0, a]$. The function $g_{A,B}$ attains a unique maximum in $(a, \infty)$ at $x_2 = x_2(B) \in (a, \infty)$. Both $x_1(B)$ and $x_2(B)$ are independent of $A$.

(b) For each fixed $B \in (0, \infty)$, the local maximizer $x_2 = x_2(B)$ is the unique solution in $(a, \infty)$ to (5.27). For $B \in (0, \bar{B}_1)$, the local minimizer $x_1 = x_1(B)$ is the unique solution in $(0, a)$ to (5.27). For $B \in [\bar{B}_1, \infty)$, $x_1 = x_1(B) = 0$.

(c) For each $B \in (0, \infty)$, $g'_{A,B}(x) < 0$ for $x \in (0, x_1(B))$, $g'_{A,B}(x) > 0$ for $x \in (x_1(B), x_2(B))$, and $g'_{A,B}(x) < 0$ for $x \in (x_2(B), \infty)$.

The following lemma is analogous to Lemma 5.3.

Lemma 7.3. (a) The local minimizer $x_1(B)$ is continuous and nonincreasing in $B \in (0, \infty)$. The local maximizer $x_2(B)$ is continuous and strictly increasing in $B \in (0, \infty)$. Furthermore, (5.31) holds and




(7.18) 



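$$\lim_{B \uparrow \infty} x_1(B) = 0 \quad \text{and} \quad \lim_{B \uparrow \infty} x_2(B) = \infty. \tag{7.19}$$

(b) For each $B \in (0, \infty)$,

$$g_{A,B}(x_2(B)) = A - h(x_2(B))/\mu. \tag{7.20}$$

For each $B \in (0, \bar{B}_1)$,

$$g_{A,B}(x_1(B)) = A - h(x_1(B))/\mu. \tag{7.21}$$

For each $B \in [\bar{B}_1, \infty)$,

$$g_{A,B}(x_1(B)) = g_{A,B}(0) = A - Be^{\lambda a} + \frac{2}{\sigma^2}\int_0^a h(-y + a)e^{-\lambda(y-a)}\,dy. \tag{7.22}$$
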

Proof. (a) Note that $x_1(B) = 0$ for $B \in [\bar{B}_1, \infty)$. Thus, $x_1(B)$ is continuous for $B \in (\bar{B}_1, \infty)$. It follows from the proof of Lemma 5.3 that $x_1(B)$ is continuously differentiable in $B \in (0, \bar{B}_1)$, and $x_2(B)$ is continuously differentiable in $B \in (0, \infty)$. One can easily check that $x_1(B)$ is continuous at $B = \bar{B}_1$ and that $x_i(B)$ has the desired monotonicity property for $i = 1, 2$. Limits in (5.31) can be obtained similarly as in Lemma 5.3. The limit on the left side of (7.19) follows from $x_1(B) = 0$ for $B \in (\bar{B}_1, \infty)$. The limit on the right side of (7.19) follows from equation (5.27) for the definition of $x_2(B)$.

(b) Equations (7.20) and (7.21) follow from the proof for (5.33). For $B \in [\bar{B}_1, \infty)$, $x_1(B) = 0$. Thus, (7.22) follows from (5.19). □

Recall the definition of $g(B)$ in (5.40) of Lemma 5.4. This time $g(B)$ is well defined for $B \in (0, \infty)$. Recall also the definitions of $\underline{A}(B)$ in (5.36) and $\bar{A}(B)$ in (5.38). We have the following lemma that is analogous to Lemma 5.4.

Lemma 7.4. The function $g(B)$ is independent of $A$. It is continuous and strictly increasing on $B \in (0, \infty)$. Furthermore,

$$\lim_{B \downarrow 0} g(B) = 0 \quad \text{and} \quad \lim_{B \uparrow \infty} g(B) = \infty.$$

Therefore there exists $\underline{B}_1 \in (0, \infty)$ such that (5.42) holds. Furthermore, for $B \in (\underline{B}_1, \infty)$, (5.43) holds.



Proof. First, we prove $g$ is strictly increasing. For $B \in (0, \bar{B}_1)$, the expression for $\frac{dg(B)}{dB}$ is identical to the one in (5.44). For $B \in (\bar{B}_1, \infty)$,

$$\frac{dg(B)}{dB} = -\frac{h'(x_2(B))}{\mu}\frac{dx_2(B)}{dB} + e^{\lambda a} = -e^{-\lambda(x_2(B) - a)} + e^{\lambda a} > 0. \tag{7.23}$$

Thus, $g$ is strictly increasing.

Next we prove $\lim_{B \uparrow \infty} g(B) = \infty$. We observe that (7.23) and (7.19) imply that $\lim_{B \uparrow \infty} \frac{dg(B)}{dB} = e^{\lambda a} > 0$, from which we have $\lim_{B \uparrow \infty} g(B) = \infty$.

The remaining proof of the lemma is identical to that of Lemma 5.4.

□
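
Lemma 7.4 is an intermediate-value argument, so it suggests a simple root-finding step. The sketch below is only an illustration: equation (5.42) is not restated in this section, so `func` and `target` are placeholders for whatever monotone function and right-hand side that equation involves, and `solve_increasing` is a hypothetical helper. It assumes `func` is continuous and strictly increasing, lies below `target` near zero, and grows without bound.

```python
def solve_increasing(func, target, lo=1e-8, hi=1.0, tol=1e-12, max_iter=200):
    """Find B > 0 with func(B) = target for a continuous, strictly increasing func.

    Assumes func(lo) < target and func(B) -> infinity as B grows, so that a
    bracketing interval exists and bisection converges to the unique root.
    """
    # Expand the bracket until func(hi) >= target.
    while func(hi) < target:
        lo, hi = hi, 2.0 * hi
    # Standard bisection on the bracket [lo, hi].
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
        if hi - lo <= tol * max(1.0, hi):
            break
    return 0.5 * (lo + hi)
```

For instance, with the stand-in function `lambda B: B + B**3` and `target = 2.0`, the routine converges to `B = 1`. Monotonicity is what guarantees that the root found is the only one.
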

With Lemma 7.4 replacing Lemma 5.4, Lemmas 5.5 and 5.6 hold without any modification.






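Lemma 7.5. The function $\Lambda_2(\bar{A}(B), B)$ is continuous and strictly increasing in $B \in (\underline{B}_1, \infty)$. Furthermore,

$$\lim_{B \downarrow \underline{B}_1} \Lambda_2(\bar{A}(B), B) = 0, \tag{7.24}$$
$$\lim_{B \uparrow \infty} \Lambda_2(\bar{A}(B), B) = \infty. \tag{7.25}$$

Therefore, there exists a unique $\underline{B}_2 \in (\underline{B}_1, \infty)$ such that (5.54) holds.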

Proof. The proof of this lemma is identical to the proof of Lemma 5.7 except that we need to prove (7.25).

To prove (7.25), we follow the expression in (5.56) for $\frac{\partial \Lambda_2(\bar{A}(B), B)}{\partial B}$. By (5.59) and (5.60), we know that $U(\bar{A}(B), B)$ decreases in $B$ and $u(\bar{A}(B), B)$ increases in $B$. Also we know that $x_1(B) = 0$ for $B \in (\bar{B}_1, \infty)$. Following the expression in (5.56) and these facts, we have

$$\liminf_{B \uparrow \infty} \frac{\partial \Lambda_2(\bar{A}(B), B)}{\partial B} > 0,$$

which implies (7.25). □

Lemma 5.6 and Lemma 7.5 immediately give the following lemma.

Lemma 7.6. For each $B \in [\underline{B}_2, \infty)$, there exists a unique $A^*(B) \in (\underline{A}(B), \bar{A}(B)]$ such that (5.61) holds.

Finally, the following lemma gives a proof of Theorem 7.2.

Lemma 7.7. There exist unique $B^*$ with $B^* \in (\underline{B}_2, \infty)$, $D^*$, $d^*$ and $\alpha^*$ such that

$$g_{A^*(B^*),B^*}(D^*) = -k,$$
$$g_{A^*(B^*),B^*}(d^*) = -k - \alpha^*,$$
$$\int_{d^*}^{D^*} \big[g_{A^*(B^*),B^*}(x) + k\big]\,dx = -K,$$
$$\alpha^* d^* = 0, \text{ and}$$
$$d^* \ge 0,$$

where $\alpha^* \ge 0$.

Proof. For any $B \in (\underline{B}_2, \infty)$, $A^*(B) \le \bar{A}(B)$. Therefore, (5.39) implies that $g_{A^*(B),B}(x_1(B)) \le -k$. Thus, there exists a unique $D(B)$ such that

$$D(B) > 0, \quad g_{A^*(B),B}(D(B)) = -k, \quad g'_{A^*(B),B}(D(B)) > 0. \tag{7.26}$$

If $x_1(B) > 0$ and $g_{A^*(B),B}(0) > -k$, then there exists a unique $d(B)$ such that

$$d(B) > 0, \quad g_{A^*(B),B}(d(B)) = -k, \quad g'_{A^*(B),B}(d(B)) < 0. \tag{7.27}$$

Inequality (5.34) shows that $x_1(B)$ is strictly decreasing in $B \in (0, \bar{B}_1)$. Also for $B \in (\underline{B}_2, \infty)$, (5.63) implies that

$$\frac{dg_{A^*(B),B}(0)}{dB} = \frac{\int_{U(B)}^{u(B)} \big[e^{-\lambda(x-a)} - e^{\lambda a}\big]\,dx}{u(B) - U(B)} < 0. \tag{7.28}$$

Therefore, $g_{A^*(B),B}(0)$ is strictly decreasing in $B \in (\underline{B}_2, \infty)$. Let $(\underline{B}_2, \underline{B}_4)$ be the interval over which $g_{A^*(B),B}(0) > -k$. If there is no $B$ that satisfies $g_{A^*(B),B}(0) > -k$, then set $\underline{B}_4 = \underline{B}_2$. Thus, for $B \in (\underline{B}_2, \bar{B}_1 \wedge \underline{B}_4)$, $d(B) > 0$. Otherwise, for $B \in [\bar{B}_1 \wedge \underline{B}_4, \infty)$, we set $d(B) = 0$. The rest of the proof mimics the proof of Lemma 5.9.

Define $\Lambda_1(A^*(B), B)$ as in (5.62). We are going to prove that $\Lambda_1(A^*(B), B)$ is continuous and strictly decreasing in $B \in [\underline{B}_2, \infty)$ and

$$\lim_{B \downarrow \underline{B}_2} \Lambda_1(A^*(B), B) = 0 \quad \text{and} \quad \lim_{B \uparrow \infty} \Lambda_1(A^*(B), B) = -\infty.$$

Therefore, there exists a unique $B^* \in (\underline{B}_2, \infty)$ such that

$$\Lambda_1(A^*(B^*), B^*) = -K,$$

from which one proves the lemma by choosing $A^* = A^*(B^*)$, $D^* = D(B^*)$, $d^* = d(B^*)$ and $\alpha^* = (k + g_{A^*(B^*),B^*}(0))^-$.

We first show that $\Lambda_1(A^*(B), B)$ is continuous and strictly decreasing in $B \in [\underline{B}_2, \infty)$. Observe that (5.63) continues to hold, from which (5.64) continues to hold. We now claim that (5.64) continues to hold for $B \in (\underline{B}_2, \infty)$ except possibly at $\bar{B}_1 \wedge \underline{B}_4$. Indeed, for $B \in (\underline{B}_2, \bar{B}_1 \wedge \underline{B}_4)$, (5.64) holds as before. For $B \in (\bar{B}_1 \wedge \underline{B}_4, \infty)$, $d(B) = 0$ and (5.64) holds as well in this case. This proves that $\Lambda_1(A^*(B), B)$ is continuous and decreasing in $B$.

Next, it is easy to see that the limit (5.66) continues to hold as well. It remains to prove

$$\lim_{B \uparrow \infty} \Lambda_1(A^*(B), B) = -\infty. \tag{7.29}$$

We will prove next that

$$\limsup_{B \uparrow \infty} \frac{d\Lambda_1(A^*(B), B)}{dB} < 0, \tag{7.30}$$

from which (7.29) immediately follows.

To see (7.30), using (7.26), we have



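$$\begin{aligned}
\frac{dD(B)}{dB} &= \frac{e^{-\lambda(D(B)-a)} - \dfrac{\int_{U(A^*(B),B)}^{u(A^*(B),B)} e^{-\lambda(x-a)}\,dx}{u(A^*(B),B) - U(A^*(B),B)}}{g'_{A^*(B),B}(D(B))} \\
&= \frac{\int_{U(A^*(B),B)}^{u(A^*(B),B)} \big[e^{-\lambda(D(B)-a)} - e^{-\lambda(x-a)}\big]\,dx}{\big(u(A^*(B),B) - U(A^*(B),B)\big)\,g'_{A^*(B),B}(D(B))} > 0.
\end{aligned} \tag{7.31}$$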
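To study the limit in (7.30), we only need to consider $\Lambda_1(A^*(B), B)$ for $B \in [\bar{B}_1 \wedge \underline{B}_4, \infty)$. When $B \in (\bar{B}_1 \wedge \underline{B}_4, \infty)$, $d(B) = 0$ and hence $\Lambda_1(A^*(B), B) = \int_0^{D(B)} [g_{A^*(B),B}(x) + k]\,dx$. Therefore, for $B \in (\bar{B}_1 \wedge \underline{B}_4, \infty)$,

$$\begin{aligned}
\frac{d\Lambda_1(A^*(B), B)}{dB} &= \int_0^{D(B)} \Big[\frac{dA^*(B)}{dB} - e^{-\lambda(x-a)}\Big]\,dx + \big[g_{A^*(B),B}(D(B)) + k\big]\frac{dD(B)}{dB} \\
&= \int_0^{D(B)} \Big[\frac{dA^*(B)}{dB} - e^{-\lambda(x-a)}\Big]\,dx,
\end{aligned}$$

where the last equality is due to $g_{A^*(B),B}(D(B)) = -k$. It follows from (5.63) that

$$\begin{aligned}
\frac{d\Lambda_1(A^*(B), B)}{dB} &= \int_0^{D(B)} \Big[\frac{1}{u(B) - U(B)}\int_{U(B)}^{u(B)} e^{-\lambda(y-a)}\,dy - e^{-\lambda(x-a)}\Big]\,dx \\
&\le \int_0^{D(B)} \big[e^{-\lambda(D(B)-a)} - e^{-\lambda(x-a)}\big]\,dx,
\end{aligned}$$

where the inequality follows from $D(B) < U(B)$. Inequality (7.31) implies that

$$\frac{d}{dB}\int_0^{D(B)} \big[e^{-\lambda(D(B)-a)} - e^{-\lambda(x-a)}\big]\,dx = -\lambda e^{-\lambda(D(B)-a)} D(B)\frac{dD(B)}{dB} < 0.$$

For any $B_0 \in (\bar{B}_1 \wedge \underline{B}_4, \infty)$, we have $\int_0^{D(B_0)} [e^{-\lambda(D(B_0)-a)} - e^{-\lambda(x-a)}]\,dx < 0$. Therefore,

$$\limsup_{B \uparrow \infty} \frac{d\Lambda_1(A^*(B), B)}{dB} \le \limsup_{B \uparrow \infty} \int_0^{D(B)} \big[e^{-\lambda(D(B)-a)} - e^{-\lambda(x-a)}\big]\,dx \le \int_0^{D(B_0)} \big[e^{-\lambda(D(B_0)-a)} - e^{-\lambda(x-a)}\big]\,dx < 0,$$

proving (7.30).

□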



8 Concluding Remarks 

In this paper, we have given a tutorial on the lower-bound approach to studying the optimal
control of Brownian inventory models with a general convex holding cost function. The 
control can be either impulse or singular, and the inventory can be either backlogged or 
without backlog. For future research, it would be interesting to study multi-stage inventory 
systems with Brownian motion demand. Yao [34] has done a preliminary study for these 
systems. 